Full text of "Schaum's Theory & Problems of Linear Algebra"

SCHAUM'S OUTLINE OF THEORY AND PROBLEMS OF LINEAR ALGEBRA

BY SEYMOUR LIPSCHUTZ, Ph.D.
Associate Professor of Mathematics, Temple University

SCHAUM'S OUTLINE SERIES
McGRAW-HILL BOOK COMPANY
New York, St. Louis, San Francisco, Toronto, Sydney

Copyright © 1968 by McGraw-Hill, Inc. All Rights Reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Preface

Linear algebra has in recent years become an essential part of the mathematical background required of mathematicians, engineers, physicists and other scientists. This requirement reflects the importance and wide applications of the subject matter.

This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all readers regardless of their fields of specialization. More material has been included than can be covered in most first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to stimulate further interest in the subject.

Each chapter begins with clear statements of pertinent definitions, principles and theorems together with illustrative and other descriptive material. This is followed by graded sets of solved and supplementary problems. The solved problems serve to illustrate and amplify the theory, bring into sharp focus those fine points without which the student continually feels himself on unsafe ground, and provide the repetition of basic principles so vital to effective learning. Numerous proofs of theorems are included among the solved problems.
The supplementary problems serve as a complete review of the material of each chapter.

The first three chapters treat of vectors in Euclidean space, linear equations and matrices. These provide the motivation and basic computational tools for the abstract treatment of vector spaces and linear mappings which follow. A chapter on eigenvalues and eigenvectors, preceded by determinants, gives conditions for representing a linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, specifically the triangular, Jordan and rational canonical forms. In the last chapter, on inner product spaces, the spectral theorem for symmetric operators is obtained and is applied to the diagonalization of real quadratic forms. For completeness, the appendices include sections on sets and relations, algebraic structures and polynomials over a field.

I wish to thank many friends and colleagues, especially Dr. Martin Silverstein and Dr. Hwa Tsang, for invaluable suggestions and critical review of the manuscript. I also want to express my gratitude to Daniel Schaum and Nicola Monti for their very helpful cooperation.

Seymour Lipschutz
Temple University
January, 1968

CONTENTS

Chapter 1. VECTORS IN Rⁿ AND Cⁿ. Introduction. Vectors in Rⁿ. Vector addition and scalar multiplication. Dot product. Norm and distance in Rⁿ. Complex numbers. Vectors in Cⁿ.

Chapter 2. LINEAR EQUATIONS. Introduction. Linear equation. System of linear equations. Solution of a system of linear equations. Solution of a homogeneous system of linear equations.

Chapter 3. MATRICES. Introduction. Matrices. Matrix addition and scalar multiplication. Matrix multiplication. Transpose. Matrices and systems of linear equations. Echelon matrices. Row equivalence and elementary row operations. Square matrices. Algebra of square matrices. Invertible matrices. Block matrices.

Chapter 4. VECTOR SPACES AND SUBSPACES. Introduction. Examples of vector spaces. Subspaces. Linear combinations, linear spans. Row space of a matrix. Sums and direct sums.

Chapter 5. BASIS AND DIMENSION. Introduction. Linear dependence. Basis and dimension. Dimension and subspaces. Rank of a matrix. Applications to linear equations. Coordinates.

Chapter 6. LINEAR MAPPINGS. Mappings. Linear mappings. Kernel and image of a linear mapping. Singular and nonsingular mappings. Linear mappings and systems of linear equations. Operations with linear mappings. Algebra of linear operators. Invertible operators.

Chapter 7. MATRICES AND LINEAR OPERATORS. Introduction. Matrix representation of a linear operator. Change of basis. Similarity. Matrices and linear mappings.

Chapter 8. DETERMINANTS. Introduction. Permutations. Determinant. Properties of determinants. Minors and cofactors. Classical adjoint. Applications to linear equations. Determinant of a linear operator. Multilinearity and determinants.

Chapter 9. EIGENVALUES AND EIGENVECTORS. Introduction. Polynomials of matrices and linear operators. Eigenvalues and eigenvectors. Diagonalization and eigenvectors. Characteristic polynomial, Cayley-Hamilton theorem. Minimum polynomial. Characteristic and minimum polynomials of linear operators.

Chapter 10. CANONICAL FORMS. Introduction. Triangular form. Invariance. Invariant direct-sum decompositions. Primary decomposition. Nilpotent operators, Jordan canonical form. Cyclic subspaces. Rational canonical form. Quotient spaces.

Chapter 11. LINEAR FUNCTIONALS AND THE DUAL SPACE. Introduction. Linear functionals and the dual space. Dual basis. Second dual space. Annihilators. Transpose of a linear mapping.

Chapter 12. BILINEAR, QUADRATIC AND HERMITIAN FORMS. Bilinear forms. Bilinear forms and matrices. Alternating bilinear forms. Symmetric bilinear forms, quadratic forms. Real symmetric bilinear forms. Law of inertia. Hermitian forms.

Chapter 13. INNER PRODUCT SPACES. Introduction. Inner product spaces. Cauchy-Schwarz inequality. Orthogonality. Orthonormal sets. Gram-Schmidt orthogonalization process. Linear functionals and adjoint operators. Analogy between A(V) and C, special operators. Orthogonal and unitary operators. Orthogonal and unitary matrices. Change of orthonormal basis. Positive operators. Diagonalization and canonical forms in Euclidean spaces. Diagonalization and canonical forms in unitary spaces. Spectral theorem.

Appendix A. SETS AND RELATIONS. Sets, elements. Set operations. Product sets. Relations. Equivalence relations.

Appendix B. ALGEBRAIC STRUCTURES. Introduction. Groups. Rings, integral domains and fields. Modules.

Appendix C. POLYNOMIALS OVER A FIELD. Introduction. Ring of polynomials. Notation. Divisibility. Factorization.

INDEX

Chapter 1

VECTORS IN Rⁿ AND Cⁿ

INTRODUCTION

In various physical applications there appear certain quantities, such as temperature and speed, which possess only "magnitude". These can be represented by real numbers and are called scalars. On the other hand, there are also quantities, such as force and velocity, which possess both "magnitude" and "direction". These quantities can be represented by arrows (having appropriate lengths and directions and emanating from some given reference point O) and are called vectors. In this chapter we study the properties of such vectors in some detail.

We begin by considering the following operations on vectors.

(i) Addition: The resultant u + v of two vectors u and v is obtained by the so-called parallelogram law, i.e. u + v is the diagonal of the parallelogram formed by u and v, as shown on the right.

(ii) Scalar multiplication: The product ku of a real number k by a vector u is obtained by multiplying the magnitude of u by k and retaining the same direction if k > 0 or the opposite direction if k < 0, as shown on the right.

Now we assume the reader is familiar with the representation of the points in the plane by ordered pairs of real numbers.
If the origin of the axes is chosen at the reference point above, then every vector is uniquely determined by the coordinates of its endpoint. The relationship between the above operations and endpoints follows.

(i) Addition: If (a, b) and (c, d) are the endpoints of the vectors u and v, then (a + c, b + d) will be the endpoint of u + v, as shown in Fig. (a) below.

[Fig. (a): u + v with endpoint (a + c, b + d). Fig. (b): ku with endpoint (ka, kb).]

(ii) Scalar multiplication: If (a, b) is the endpoint of the vector u, then (ka, kb) will be the endpoint of the vector ku, as shown in Fig. (b) above.

Mathematically, we identify a vector with its endpoint; that is, we call the ordered pair (a, b) of real numbers a vector. In fact, we shall generalize this notion and call an n-tuple (a₁, a₂, ..., aₙ) of real numbers a vector. We shall again generalize and permit the coordinates of the n-tuple to be complex numbers and not just real numbers. Furthermore, in Chapter 4, we shall abstract properties of these n-tuples and formally define the mathematical system called a vector space.

We assume the reader is familiar with the elementary properties of the real number field which we denote by R.

VECTORS IN Rⁿ

The set of all n-tuples of real numbers, denoted by Rⁿ, is called n-space. A particular n-tuple in Rⁿ, say
u = (u₁, u₂, ..., uₙ)
is called a point or vector; the real numbers uᵢ are called the components (or: coordinates) of the vector u. Moreover, when discussing the space Rⁿ we use the term scalar for the elements of R, i.e. for the real numbers.

Example 1.1: Consider the following vectors:
(0, 1), (1, −3), (1, 2, √3, 4), (−5, −1, 0, π)
The first two vectors have two components and so are points in R²; the last two vectors have four components and so are points in R⁴.

Two vectors u and v are equal, written u = v, if they have the same number of components, i.e. belong to the same space, and if corresponding components are equal.
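The componentwise rules just given for endpoints, together with the equality test for vectors, can be illustrated in a few lines of Python. A minimal sketch (the helper names add and scale are ours, not the book's):

```python
def add(u, v):
    # (a, b) + (c, d) = (a + c, b + d): add corresponding components
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    # k(a, b) = (ka, kb): multiply each component by the scalar k
    return tuple(k * x for x in u)

print(add((1, 2), (3, 4)))     # (4, 6)
print(scale(3, (2, -1)))       # (6, -3)

# Equality: same number of components, corresponding components equal.
print((1, 2, 3) == (2, 3, 1))  # False
```

Python tuples compare exactly as the equality definition above prescribes, which is why a bare `==` suffices here.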
The vectors (1, 2, 3) and (2, 3, 1) are not equal, since corresponding components are not equal.

Example 1.2: Suppose (x − y, x + y, z − 1) = (4, 2, 3). Then, by definition of equality of vectors,
x − y = 4, x + y = 2, z − 1 = 3
Solving the above system of equations gives x = 3, y = −1, and z = 4.

VECTOR ADDITION AND SCALAR MULTIPLICATION

Let u and v be vectors in Rⁿ:
u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ)
The sum of u and v, written u + v, is the vector obtained by adding corresponding components:
u + v = (u₁ + v₁, u₂ + v₂, ..., uₙ + vₙ)
The product of a real number k by the vector u, written ku, is the vector obtained by multiplying each component of u by k:
ku = (ku₁, ku₂, ..., kuₙ)
Observe that u + v and ku are also vectors in Rⁿ. We also define
−u = −1u and u − v = u + (−v)
The sum of vectors with different numbers of components is not defined.

Example 1.3: Let u = (1, −3, 2, 4) and v = (3, 5, −1, −2). Then
u + v = (1 + 3, −3 + 5, 2 − 1, 4 − 2) = (4, 2, 1, 2)
5u = (5·1, 5·(−3), 5·2, 5·4) = (5, −15, 10, 20)
2u − 3v = (2, −6, 4, 8) + (−9, −15, 3, 6) = (−7, −21, 7, 14)

Example 1.4: The vector (0, 0, ..., 0) in Rⁿ, denoted by 0, is called the zero vector. It is similar to the scalar 0 in that, for any vector u = (u₁, u₂, ..., uₙ),
u + 0 = (u₁ + 0, u₂ + 0, ..., uₙ + 0) = (u₁, u₂, ..., uₙ) = u

Basic properties of the vectors in Rⁿ under the operations of vector addition and scalar multiplication are described in the following theorem.

Theorem 1.1: For any vectors u, v, w ∈ Rⁿ and any scalars k, k′ ∈ R:
(i) (u + v) + w = u + (v + w)
(ii) u + 0 = u
(iii) u + (−u) = 0
(iv) u + v = v + u
(v) k(u + v) = ku + kv
(vi) (k + k′)u = ku + k′u
(vii) (kk′)u = k(k′u)
(viii) 1u = u

Remark: Suppose u and v are vectors in Rⁿ for which u = kv for some nonzero scalar k ∈ R. Then u is said to be in the same direction as v if k > 0, and in the opposite direction if k < 0.

DOT PRODUCT

Let u and v be vectors in Rⁿ:
u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ)
The dot or inner product of u and v, denoted by u·v, is the scalar obtained by multiplying corresponding components and adding the resulting products:
u·v = u₁v₁ + u₂v₂ + ... + uₙvₙ
The vectors u and v are said to be orthogonal (or: perpendicular) if their dot product is zero: u·v = 0.

Example 1.5: Let u = (1, −2, 3, −4), v = (6, 7, 1, −2) and w = (5, −4, 5, 7). Then
u·v = 1·6 + (−2)·7 + 3·1 + (−4)·(−2) = 6 − 14 + 3 + 8 = 3
u·w = 1·5 + (−2)·(−4) + 3·5 + (−4)·7 = 5 + 8 + 15 − 28 = 0
Thus u and w are orthogonal.

Basic properties of the dot product in Rⁿ follow.

Theorem 1.2: For any vectors u, v, w ∈ Rⁿ and any scalar k ∈ R:
(i) (u + v)·w = u·w + v·w
(ii) (ku)·v = k(u·v)
(iii) u·v = v·u
(iv) u·u ≥ 0, and u·u = 0 iff u = 0

Remark: The space Rⁿ with the above operations of vector addition, scalar multiplication and dot product is usually called Euclidean n-space.

NORM AND DISTANCE IN Rⁿ

Let u and v be vectors in Rⁿ: u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ). The distance between the points u and v, written d(u, v), is defined by
d(u, v) = √((u₁ − v₁)² + (u₂ − v₂)² + ... + (uₙ − vₙ)²)
The norm (or: length) of the vector u, written ||u||, is defined to be the nonnegative square root of u·u:
||u|| = √(u·u) = √(u₁² + u₂² + ... + uₙ²)
By Theorem 1.2, u·u ≥ 0 and so the square root exists. Observe that
d(u, v) = ||u − v||

Example 1.6: Let u = (1, −2, 4, 1) and v = (3, 1, −5, 0). Then
d(u, v) = √((1 − 3)² + (−2 − 1)² + (4 + 5)² + (1 − 0)²) = √95
||v|| = √(3² + 1² + (−5)² + 0²) = √35

Now if we consider two points, say p = (a, b) and q = (c, d) in the plane R², then
||p|| = √(a² + b²) and d(p, q) = √((a − c)² + (b − d)²)
That is, ||p|| corresponds to the usual Euclidean length of the arrow from the origin to the point p, and d(p, q) corresponds to the usual Euclidean distance between the points p and q, as shown below:

[Figure: the arrow from the origin to p = (a, b), and the horizontal and vertical legs |a − c| and |b − d| between p = (a, b) and q = (c, d).]

A similar result holds for points on the line R and in space R³.
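The dot product, norm and distance defined above are equally direct to compute. A small Python sketch (the function names are ours, not the book's), checked against Examples 1.5 and 1.6:

```python
import math

def dot(u, v):
    # u . v = u1*v1 + ... + un*vn (defined only when u and v lie in the same R^n)
    assert len(u) == len(v), "vectors must have the same number of components"
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    # ||u|| = sqrt(u . u)
    return math.sqrt(dot(u, u))

def dist(u, v):
    # d(u, v) = ||u - v||
    return norm([a - b for a, b in zip(u, v)])

# Example 1.5: u . w = 0, so u and w are orthogonal
print(dot((1, -2, 3, -4), (5, -4, 5, 7)))   # 0

# Example 1.6: d(u, v) = sqrt(95), ||v|| = sqrt(35)
print(dist((1, -2, 4, 1), (3, 1, -5, 0)))   # about 9.747, i.e. sqrt(95)
print(norm((3, 1, -5, 0)))                  # about 5.916, i.e. sqrt(35)
```

Note that `dist` simply reuses `norm`, mirroring the identity d(u, v) = ||u − v|| stated above.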
Remark: A vector e is called a unit vector if its norm is 1: ||e|| = 1. Observe that, for any nonzero vector u ∈ Rⁿ, the vector e = u/||u|| is a unit vector in the same direction as u.

We now state a fundamental relationship known as the Cauchy-Schwarz inequality.

Theorem 1.3 (Cauchy-Schwarz): For any vectors u, v ∈ Rⁿ, |u·v| ≤ ||u|| ||v||.

Using the above inequality, we can now define the angle θ between any two nonzero vectors u, v ∈ Rⁿ by
cos θ = u·v / (||u|| ||v||)
Note that if u·v = 0, then θ = 90° (or: θ = π/2). This then agrees with our previous definition of orthogonality.

COMPLEX NUMBERS

The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of real numbers; equality, addition and multiplication of complex numbers are defined as follows:
(a, b) = (c, d) iff a = c and b = d
(a, b) + (c, d) = (a + c, b + d)
(a, b)(c, d) = (ac − bd, ad + bc)

We identify the real number a with the complex number (a, 0):
a ↔ (a, 0)
This is possible since the operations of addition and multiplication of real numbers are preserved under the correspondence:
(a, 0) + (b, 0) = (a + b, 0) and (a, 0)(b, 0) = (ab, 0)
Thus we view R as a subset of C and replace (a, 0) by a whenever convenient and possible.

The complex number (0, 1), denoted by i, has the important property that
i² = (0, 1)(0, 1) = (−1, 0) = −1, or i = √−1
Furthermore, using the fact
(a, b) = (a, 0) + (0, b) and (0, b) = (b, 0)(0, 1)
we have
(a, b) = (a, 0) + (b, 0)(0, 1) = a + bi
The notation a + bi is more convenient than (a, b).
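The ordered-pair definition of complex multiplication can be checked against Python's built-in complex type. A brief sketch (the helper cmul is ours, not from the text):

```python
def cmul(z, w):
    # (a, b)(c, d) = (ac - bd, ad + bc), the pair definition above
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

# i^2 = (0, 1)(0, 1) = (-1, 0), i.e. i^2 = -1
print(cmul((0, 1), (0, 1)))         # (-1, 0)

# The same arithmetic in a + bi form, using the built-in type:
z = complex(5, 3) * complex(2, -7)  # (5 + 3i)(2 - 7i)
print(z)                            # (31-29j)
```

Python writes i as j (engineering convention), but the arithmetic is exactly that of the definitions above.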
For example, the sum and product of complex numbers can be obtained by simply using the commutative and distributive laws and the relation i² = −1:
(a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i
(a + bi)(c + di) = ac + bci + adi + bdi² = (ac − bd) + (bc + ad)i

The conjugate of the complex number z = (a, b) = a + bi is denoted and defined by
z̄ = a − bi
(Notice that z z̄ = a² + b².) If, in addition, z ≠ 0, then the inverse z⁻¹ of z and division by z are given by
z⁻¹ = z̄/(z z̄) = a/(a² + b²) − (b/(a² + b²))i and w/z = w z⁻¹
where w ∈ C. We also define
−z = −1z and w − z = w + (−z)

Example 1.7: Suppose z = 2 + 3i and w = 5 − 2i. Then
z + w = (2 + 3i) + (5 − 2i) = 2 + 5 + 3i − 2i = 7 + i
zw = (2 + 3i)(5 − 2i) = 10 + 15i − 4i − 6i² = 16 + 11i
z̄ = (2 + 3i)‾ = 2 − 3i and w̄ = (5 − 2i)‾ = 5 + 2i
w/z = (5 − 2i)/(2 + 3i) = (5 − 2i)(2 − 3i)/((2 + 3i)(2 − 3i)) = (4 − 19i)/13 = 4/13 − (19/13)i

Just as the real numbers can be represented by the points on a line, the complex numbers can be represented by the points in the plane. Specifically, we let the point (a, b) in the plane represent the complex number z = a + bi, i.e. whose real part is a and whose imaginary part is b. The absolute value of z, written |z|, is defined as the distance from z to the origin:
|z| = √(a² + b²)
Note that |z| is equal to the norm of the vector (a, b). Also, |z|² = z z̄.

Example 1.8: Suppose z = 2 + 3i and w = 12 − 5i. Then
|z| = √(4 + 9) = √13 and |w| = √(144 + 25) = 13

Remark: In Appendix B we define the algebraic structure called a field. We emphasize that the set C of complex numbers with the above operations of addition and multiplication is a field.

VECTORS IN Cⁿ

The set of all n-tuples of complex numbers, denoted by Cⁿ, is called complex n-space. Just as in the real case, the elements of Cⁿ are called points or vectors, the elements of C are called scalars, and vector addition in Cⁿ and scalar multiplication on Cⁿ are given by
(z₁, z₂, ..., zₙ) + (w₁, w₂, ..., wₙ) = (z₁ + w₁, z₂ + w₂, ..., zₙ + wₙ)
z(z₁, z₂, ..., zₙ) = (zz₁, zz₂, ..., zzₙ)
where zᵢ, wᵢ, z ∈ C.

Example 1.9: (2 + 3i, 4 − i, 3) + (3 − 2i, 5i, 4 − 6i) = (5 + i, 4 + 4i, 7 − 6i)
2i(2 + 3i, 4 − i, 3) = (−6 + 4i, 2 + 8i, 6i)

Now let u and v be arbitrary vectors in Cⁿ:
u = (z₁, z₂, ..., zₙ), v = (w₁, w₂, ..., wₙ), zᵢ, wᵢ ∈ C
The dot, or inner, product of u and v is defined as follows:
u·v = z₁w̄₁ + z₂w̄₂ + ... + zₙw̄ₙ
Note that this definition reduces to the previous one in the real case, since w̄ᵢ = wᵢ when wᵢ is real. The norm of u is defined by
||u|| = √(u·u) = √(z₁z̄₁ + z₂z̄₂ + ... + zₙz̄ₙ) = √(|z₁|² + |z₂|² + ... + |zₙ|²)
Observe that u·u and so ||u|| are real and positive when u ≠ 0, and 0 when u = 0.

Example 1.10: Let u = (2 + 3i, 4 − i, 2i) and v = (3 − 2i, 5, 4 − 6i). Then
u·v = (2 + 3i)(3 − 2i)‾ + (4 − i)(5)‾ + (2i)(4 − 6i)‾ = (2 + 3i)(3 + 2i) + (4 − i)(5) + (2i)(4 + 6i) = 13i + 20 − 5i − 12 + 8i = 8 + 16i
u·u = (2 + 3i)(2 + 3i)‾ + (4 − i)(4 − i)‾ + (2i)(2i)‾ = (2 + 3i)(2 − 3i) + (4 − i)(4 + i) + (2i)(−2i) = 13 + 17 + 4 = 34
||u|| = √(u·u) = √34

The space Cⁿ with the above operations of vector addition, scalar multiplication and dot product, is called complex Euclidean n-space.

Remark: If u·v were defined by u·v = z₁w₁ + ... + zₙwₙ, then it is possible for u·u = 0 even though u ≠ 0, e.g. if u = (1, i, 0). In fact, u·u may not even be real.

Solved Problems

VECTORS IN Rⁿ

1.1. Compute: (i) (3, −4, 5) + (1, 1, −2); (ii) (1, 2, −3) + (4, −5); (iii) −3(4, −5, −6); (iv) −(−6, 7, −8).
(i) Add corresponding components: (3, −4, 5) + (1, 1, −2) = (3 + 1, −4 + 1, 5 − 2) = (4, −3, 3).
(ii) The sum is not defined since the vectors have different numbers of components.
(iii) Multiply each component by the scalar: −3(4, −5, −6) = (−12, 15, 18).
(iv) Multiply each component by −1: −(−6, 7, −8) = (6, −7, 8).

1.2. Let u = (2, −7, 1), v = (−3, 0, 4), w = (0, 5, −8). Find (i) 3u − 4v, (ii) 2u + 3v − 5w.
First perform the scalar multiplication and then the vector addition.
(i) 3u-4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13) (ii) 2u + 3v-5w = 2(2, -7, 1) + 3(-3, 0, 4) - 5(0, 5, -8) = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (4 - 9 + 0, -14 + - 25, 2 + 12 + 40) = (-5, -39, 54) 1.3. Find x and y if {x, 3) = {2,x + y). Since the two vectors are equal, the corresponding components are equal to each other: X = 2, 3 = X + y Substitute x = 2 into the second equation to obtain y = 1. Thus x — 2 and j/ = 1. 1.4. Find x and y if (4, y) = x(2, 3). Multiply by the scalar x to obtain (4, y) = x{2, 3) = (2x, Zx). Set the corresponding components equal to each other: 4 = 2x, y = 3a;. Solve the linear equations for x and y: x = 2 and y = 6. 1.5. Find x, y and z if (2, -3, 4) = x{l, 1, 1) + y{l, 1, 0) + z(l, 0, 0). First multiply by the scalars x, y and z and then add: (2,-3,4) = a;(l, 1, 1) + j/(l, 1, 0) + «(1, 0, 0) = {X, X, x) + (y, y, 0) + (z, 0, 0) = (x-^y + z,x-\-y,x) Now set the corresponding components equal to each other: a; + j/ + z = 2, X -\- y = —3, a; = 4 To solve the system of equations, substitute k = 4 into the second equation to obtain 4 + j/ = — 3 or 3/ = —7. Then substitute into the first equation to find z = 5. Thus x — 4, y = —7, z = 5. 1.6. Prove Theorem 1.1: For any vectors u,v,w GW and any scalars fc, fc'SR, (i) {u + v) + w — u-\- {v + w) (v) k{u + v) — ku + kv (ii) u + — u (vi) (fc + k')u — ku + k'u (iii) u + {-u) - (vii) (kk')u - k{k'u) (iv) u + V = V + u (viii) lu = u Let Wj, Vj and Wj be the ith components of u, v and w, respectively. 8 VECTORS IN R" AND C» [CHAP. 1 (i) By definition, Mj + Vi is the ith component oi u + v and so (itj + Vj) + Wj is the tth component of (u + v) + w. On the other hand, Vi + Wj is the ith component oi v + w and so % + (Vj + Wj) is the tth component of u + (v + w). 
But Mj, Vj and Wj are real numbers for which the as- sociative law holds, that is, (Ui + Vi) + Wi - Mj-l-(Di + Wj) for i-\,...,n Accordingly, (m + 1)) + tf = u+ {v-^w) since their corresponding components are equal. (ii) Here, = (0, 0, . . ., 0); hence M + = (Ml, M2. • • •. Mn) + (0, 0, . . ., 0) = (Ml + 0, Mg + 0, . . . , M„ + 0) = (Ml, M2, . . . , M„) = M (iii) Since — m = — 1(mi, M2, . . . , m„) = (— Mi, — M2, . . . , — m„), M + (— m) = (Ml, Mg, . . . , M„) + (—Ml, — M2, . . . , — M„) = (Ml - Ml, M2 - M2, . . . , M„ - M„) = (0, 0, . . . , 0) = (iv) By definition, n^ + v^ is the ith component of u + v, and Vj + Mj is the ith. component of v + u. But Mj and ijj are real numbers for which the commutative law holds, that is, Wi + ft = ■"( + Mi, i = 1, . . . , w Hence m + v = 1; + m since their corresponding components are equal. (v) Since Mj + Vj is the ith component of u + v, k(Ui + Vi) is the ith component of k(u + v). Since fcMj and kvi are the ith components of ku and kv respectively, Ajm; + fcvj is the ith component of ku + kv. But k, Mj and v^ are real numbers; hence fc(Mj + Vi) = ftMj + fc^j, i = 1, . . . , n Thus k{u + v) = ku + kv, as corresponding components are equal. (vi) Observe that the first plus sign refers to the addition of the two scalars k and k' whereas the second plus sign refers to the vector addition of the two vectors ku and k'u. By definition, (fe + fc')Mj is the ith component of the vector (k + k')u. Since fcMj and Aj'Mj are the ith components of ku and k'u respectively, kUf + k% is the ith component of ku + k'u. But k, k' and Mj are real numbers; hence (fc + k')Ui — kUi + k'Ui, i = 1, . . . , n Thus (k + k')u = ku + k'u, as corresponding components are equal. (vii) Since fc'Mj is the ith component of k'u, k(k'u^ is the ith component of fe(fe'M). But (fcfc')Mi is the ith component of (kk')u and, since fc, k' and Mj are real numbers, (fcfc')Mj = fc(fc'Mj), i=l, ...,n Hence (kk')u = k(k'u), as corresponding components are equal. 
(viii) 1 • M = 1(mi, M2, . . . , M„) - (1mi, lu^, .... 1m„) = (mi, M2, . . . , m„) = u. 1.7. Show that Ou = for any vector u, where clearly the first is a scalar and the second a vector. Method 1: Om = 0(mi, M2, . . . , m„) = (Omi, OM2, . . . , Om„) = (0, 0, ...,0) = Method 2: By Theorem 1.1, Om = (0 + 0)m = Om + Om Adding — Om to both sides gives us the required result. DOT PRODUCT 1.8. Compute u • v where: (i) u = (2, -3, 6), v = (8, 2, -3); (ii) u = (1, -8, 0, 5), v = (3, 6, 4); (iii) tt = (3,-5,2,l), v- (4, 1,-2, 5). (i) Multiply corresponding components and add: wv = 2 • 8 + (—3) • 2 + 6 • (—3) = —8. (ii) The dot product is not defined between vectors with different numbers of components. (iii) Multiply corresponding components and add: u-v = 3 • 4 + (—5) • 1 + 2 • (—2) + 1 • 5 = 8. CHAP. 1] VECTORS IN R" AND C" 9 1.9. Determine k so that the vectors u and v are orthogonal where (i) u ^ (1, k, -3) and v = (2, -5, 4) (ii) u = (2, 3fc, -4, 1, 5) and v = (6> -1, 3, 7, 2fc) In each case, compute u • v, set it equal to 0, and solve for k. (i) u'v = 1 • 2 + fc • (-5) + (-3) -4 = 2 - 5fe - 12 = 0, -5k - 10 == 0, fc = -2 (ii) U'V = 2-6 + 3fc'(-l) + (-4)'3 + 1*7 + 5'2fc = 12 - 3fe - 12 + 7 + lO/c = 0, k = -l 1.10. Prove Theorem 1.2: For any vectors u,v,w G R" and any scalar kGK, (i) {u + v)'W = U'W + vw (iii) wv = i; • tt (ii) (fcM)-^ = k{u'v) (iv) M'M - 0, and u-u = iff u = Let M = (Mi,M2 O. •" = (■yi.'y2. • • •'■"n). W = (^1,^2. ■ ■ • . W„). (i) Since u + v = (mi + Vi, M2 + ■"2. •••.**„ + I'm). (u + v)'W = (Ml + Vi)Wi + (% + '"2)^2 + • • • + (U„ + Vn)Wn = UiWi + ViWi + U2W2 + 1'2M'2 + • • ■ + M„W„ + V„W„ = (MiWi + M2W2 + • • • + M„w„) + (viWi + V2W2 + • • • + ■y„w„) = U'W + VW (ii) Since ku = (ku^, ku^, . . ., ku^), (ku) • V = ku^Vi + fcM2'y2 + • • • + ku^V^ = HU^V^ + M2'y2 -I 1- '^n^n) = *=(«* ' ■") (iii) U'V = MjDi + M2''^2 + ■ • • + Mn''^n = '"l"! 
+ ■''2'*2 + ' " ' + 1>n^n = V'U (iv) Since wf is nonnegative for each i, and since the sum of nonnegative real numbers is non- negative, 2 I 2 I J. ,.2 =- n Furthermore, u'U = iff Mj = for each i, that is, iff m = 0. DISTANCE AND NORM IN R» 1.11. Find the distance d{u, v) between the vectors u and v where: (i) u = (1, 7), v = (6, -5); (ii) «=(3,-5,4), t; = (6,2,-1); (iii) m = (5,3,-2,-4,-1), ■?; = (2,-1,0,-7,2). In each case use the formula d(u, v) = VW - v{)^ + •••+(«„ - vj^ . (i) d(u,v) = V(l - 6)2 + (7 + 5)2 = V25 + 144 = \/l69 = 13 (ii) d(u,v) = V(3 - 6)2 + (-5 - 2)2 + (4 + 1)2 = V9 + 49 + 25 = a/83 (iii) d(u,v) = V(5 - 2)2 + (3 + 1)2 + (-2 + 0)2 + (-4 + 7)2 + (-1 - 2)2 = \/47 1.12. Find k such that d{u, v) = 6 where « = (2, fc, 1, -4) and v = (3, -1, 6, -3). (d(u, i;))2 = (2 - 3)2 + (fe + 1)2 + (1 - 6)2 + (-4 + 3)2 = fe2 + 2fe + 28 Now solve fc2 + 2fc + 28 = 62 to obtain fc = 2, -4. 1.13. Find the norm ||m|| of the vector u if (i) u = (2, -7), (ii) u = (3, -12, -4). In each case use the formula ||m|1 = y/u^ + m2 4. . . . + ^2 , (i) IHI = V22 + (-7)2 = V4 + 49 = ^53 (ii) 11^11 = V32 + (-12)2 + (_4)2 = V9 + 144 + 16 = V169 = 13 10 VECTORS IN R" AND C» [CHAP. 1 1.14. Determine & such that ||tt|| = VS^ where u = {l,k,-2,5). I|m|I2 = 12 + fc2 + (-2)2 + 52 = A;2 + 30 Now solve fc2 + 30 = 39 and obtain fc = 3, -3. 1.15. Show that ||m|| ^ 0, and ||m|| = ifi u = 0. By Theorem 1.2, wu — O, and u'u = iff m = 0. Since ||m|| = yjii-u, the result follows. 1.16. Prove Theorem 1.3 (Cauchy-Schwarz): For any vectors u = {u\, . . . , m„) and v — (vi, . . .,Vn) in B", \u-v\ ^ ]\u\\ \\v\\ . n We shall prove the following stronger statement: \u'v\ — 2 |Mt'"tl ^ I|m|| HvH. If M = or V = 0, then the inequality reduces to — — and is therefore true. Hence we need only consider the case in which m # and v j^ 0, i.e. where ||m|| # and ||i;|| # 0. Furthermore, \U'V\ - IMjI)! + • • • + M„V„| ^ \UiVi\ + ••• + \u„vj = 2 kiVil Thus we need only prove the second inequality. 
Now for any real numbers w, 3/ G R, — (x — j/)2 = x^ — 2xy + y^ or, equivalently, 2xy ^ a;2 + 3/2 (1) Set X = |mj|/||m|| and y = Ifil/HvH in (1) to obtain, for any i, IHI IHI IMP IWP ^^' But, by definition of the norm of a vector, ||m|| = 2m,^ = 2 kiP and ||i;|| = S^f = 2|v,-|2. Thus summing {2) with respect to i and using \u(Vi\ = ImjI |i;j|, we have 2M ^ 2kP 2 It'iP IMI^ IMP IMI IHI IMP IMP IMP IblP 2 ki^il that is, II ,1 ,1 11 - 1 IMI IHI Multiplying both sides by ||m|| H'wH, we obtain the required inequality. 1.17. Prove Minkowski's inequality: For any vectors u-{ui,...,Un) and v = {vi, . . .,Vn) in R", ||m + vH =^ ||tt|| + ||v||. If IIm + vjI = 0, the inequality clearly holds. Thus we need only consider the case ||m + i;1| #0. Now JMj + V(| — jwjl + |i)j| for any real numbers mj, Vj G R. Hence \\u + v\\2 = 2(«i + i'<)=' = 2k + i'iP = 2 ki + vil \ui + Vjj ^ 2 ki + Vil (kil + M) = 2 ki + Vil \ui\ + 2 ki + Vil Ivjj But by the Cauchy-Schwarz inequality (see preceding problem), 2M+fj|KI ^ Ik+^IIIMI and 2k + 'yilkl ^ Ik + HIIHI Thus ||M + f||2 ^ |Im + i;|| IHI + ||m + H| ||i;|l = ||tt + i;|| (IMI + lbll) Dividing by ||m-|- v||, we obtain the required inequality. CHAP. 1] VECTORS IN R» AND C« 11 1.18. Prove that the norm in R" satisfies the following laws: [Ni]: For any vector w, ||m||^0; and |H| = iff u = 0. [Nz]: For any vector M and any scalar A;, \\ku\\ = |/cl ||m||. [Na]: For any vectors u and v, \\u + v\\ ^ \\u\\ + ||t;||. [Ni] was proved in Problem 1.15, and [Ng] in Problem 1.17. Hence we need only prove that [Ni] holds. Suppose u = (ui, ii2, . . ., u„) and so ku = (kui, ku^, .... ku^. Then ||fcMl|2 = (fcMi)2 + (kui)^ + ••• + (fcM„)2 = khi\ + khi\ + ••• + khil The square root of both sides of the equality gives us the required result. COMPLEX NUMBERS 1.19. Simplify: (i) (5 + 3i)(2-7i); (ii) (4-3*^; ("i) g"^'' (i^) |^; (v) »', i*. *" ,-31. (vi) (l + 2i)''; (vii)(2^)' (i) (5 + 3t)(2 - 7i) = 10 + 6i - 35i - 21i? 
= 31 - 29i (ii) (4 - 3t)2 = 16 - 24t + 9t2 = 7 - 24t r"\ 1 _ (3 + 4t) _ 3 + 4i ^ A + Aj ^*"^ 3 - 4i (3 - 4i)(3 + 4t) 25 25 25 2-7i _ (2 - 7t)(5 - 3t) _ -11 - 41i _ _11_41. ^'^' 5 + 3t ^ (5 + 3i)(5-3t) 34 34 34 (v) ts = i^'i = (-l)t = -i; i* = v^"P = 1; P^ = (i*)7't^ = 1^ • (-t) = -i (vi) (1 + 2i)8 = 1 + 6i + 121* + 8i^ = 1 + 6i - 12 - 8i = -11 - 2i . / 1 Y _ 1 _ (-5 + 12t) _ -5 + 12i _ __5_ , 12 (vu) \^2-3iJ ~ -5-12i ~ (-5 - 12t)(-5 + 12t) 169 169 1.20. Let 2! = 2 - 3i and w = 4 + 5i Find: (i) z + w and zw; (ii) z/w; (iii) « and w; (iv) \z\ and \w\. (i) z + w = 2 - 3i + 4 + 5i = 6 + 2i zw = (2 - 3i)(4 + 5t) = 8 - 12i + lOi - 15t2 = 23 - 2i f\ £. - 2-3t _ (2 - 3i)(4 - 5i) _ -7 - 22i ^ _ 1. _ 22 . 169 ' 169* w 4 + 6i (4 + 5i)(4 - 5t) 41 41 41 (iii) Use a+U = a-bi: « = 2 - 3t = 2 + 3t; w = 4 + 5i = 4 - 5t. (iv) Use |a+6i| = V^^TP: \z\ = |2 - 3t| = V4 + 9 = Vl3; \w\ = [4 + 5i| =: Vl6 + 25= Vil. 1.21. Prove: For any complex numbers z,w G.C, (i) z + w = z + w, (ii) zw = zw, (iii) z = z. Suppose z = a+bi and w = e + di where a, b,c,d& R. (i) z + w = (a+bi) + (e + di} = {a+c) + {b + d)i = {a + e) — {b + d)i = a + c — bi — di = (a — bi) + {e — di) = z + w (ii) zw = (a+ bi)(c + di) = (ae — bd) + (ad+ bc)i = {ac — bd) — (ad + bc)i = (a-bi)(c — di) - zw (iii) z — a+bi — a — bi = a— (—b)i = a+bi = « 12 VECTORS IN R« AND C" [CHAP. 1 1.22. Prove: For any complex numbers z,w GC, \zw\ = \z\ \w\. Suppose z = a + bi and w = c + di where a, b,c,dG R. Then |«|2 = o2 + 62, |w|2 = c2 + d2, and zw = {ac - bd) + {ad + bc)i Thus |zw|2 = (ac - 6d)2 + (ad + 6c)2 = a2c2 - 2abcd + b^d? + aU^ + 2abcd + 62c2 = a^C^ + d2) + 62(c2 + d2) = ((j2 + 62)(c2 + (£2) = |2|2 |^|2 The square root of both sides gives us the desired result. 1.23. Prove: For any complex numbers z,w€lC, \z + w\ — \z\ + \w\. Suppose z = a + bi and w = c + di where a, 6, c, d € R. Consider the vectors u = (a, 6) and V = (c, d) in R2. 
Note that \z\ = Va2 + 62 = ||m||, jwl = Vc2 + rf2 = |j^|| and \z + w\ = \{a + c) + (b + d)i\ = V(« + c)2 + (6 + d)2 = ||(a+ c, 6 + d)|| = \\u + v\\ By Minkowski's inequality (Problem 1.17), ||m + v|| — ||m|| + \\v\\ and so \z + w\ - \\u + v\\ ^ ||m|| + ||t;|| = |z| + lw| VECTORS IN C" 1.24. Let M = (3 - 2i, 4i, 1 + 6i) and v = {5 + i,2-3i, 5). Find: (i) u + v, (il) 4m, (iii) (1 + i)v, (iv) (1 - 2i)u + (3 + i)v. (i) Add corresponding components: u + v — (8 — t, 2 + i, 6 + 6t). (ii) Multiply each component of u by the scalar 4t: 4m = (8 + 12i, —16, —24 + 4t). (iii) Multiply each component of v by the scalar 1 + i: (1 + t)i; = (5 + 6i + i2, 2 - 1 - 3i2, 6 + 5t) = (4 + 6i, 5 - i, 5 + 5i) (iv) First perform the scalar multiplication and then the vector addition: (1 - 2i)u + (3 + i)v = (-1 - 8i 8 + 4i, 13 + 4i) + (14 + 8i, 9 - Ii, 15 + 5i) = (13, 17 - 3i, 28 + 9t) 1.25. Find u-v and vu where: (i) u = {l — 2i,S + i), v = (4 + 2i, 5 — 6i); (ii) u (3 - 2i, 4i, 1 + 6i), v = (5 + i,2-Si,l + 2i). Recall that the conjugates of the second vector appear in the dot product: (2l, . . ., Z„) • (Wi, ...,WJ = «iWi + • • ■ + 2„W„ i (i) M • v = (1 - 2t)(4 + 2i) + (3 + t)(5 - 6t) = (1 - 2i)(4 - 2i) + (3 + j)(5 + 6t) = -lOi + 9 + 23i = 9 + 13i vu = (4 + 2t)(l - 2i) + (5 - 6i)(3 + 1) = (4 + 2t)(l + 2i) + (5 - 6z)(3 - i) = lOi + 9 - 23i = 9 - 13i (ii) u-v = (3 - 2i)(5 + i) + (4i)(2 - 3i) + (1 + 6i)(7 + 2i) = (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i ij • tt = (5 + i)(3 - 2i) + (2 - 3t)(4i) + (7 + 2i)(l + 6t) = (5 + i)(3 + 2i) + (2 - 3i)(-4t) + (7 + 2i)(l - 6i) = 20 - 35i In both examples, vu — wv. This holds true in general, as seen in Problem 1.27. CHAP. 1] VECTORS IN R» AND C» 13 1.26. Find ||tt|| where: (i) m = (3 + 4i, 5 - 2i, 1 - 3i); (ii) u={4- i, 2i, 3 + 2t, 1 - hi). Recall that zz — w''+ 6^ when z = a+ hi. 
Use ||u||^2 = u·u = z1 z̄1 + z2 z̄2 + ... + zn z̄n, where u = (z1, z2, ..., zn).

(i) ||u||^2 = 3^2 + 4^2 + 5^2 + (-2)^2 + 1^2 + (-3)^2 = 64, or ||u|| = 8

(ii) ||u||^2 = 4^2 + (-1)^2 + 2^2 + 3^2 + 2^2 + 1^2 + (-5)^2 = 60, or ||u|| = sqrt(60) = 2 sqrt(15)

1.27. Prove: For any vectors u, v ∈ C^n and any scalar z ∈ C, (i) u·v = conj(v·u), (ii) (zu)·v = z(u·v), (iii) u·(zv) = z̄(u·v). (Compare with Theorem 1.2.)

Suppose u = (z1, z2, ..., zn) and v = (w1, w2, ..., wn).

(i) Using the properties of the conjugate established in Problem 1.21,
    conj(v·u) = conj(w1 z̄1 + w2 z̄2 + ... + wn z̄n) = conj(w1 z̄1) + conj(w2 z̄2) + ... + conj(wn z̄n)
             = w̄1 z1 + w̄2 z2 + ... + w̄n zn = z1 w̄1 + z2 w̄2 + ... + zn w̄n = u·v

(ii) Since zu = (zz1, zz2, ..., zzn),
    (zu)·v = zz1 w̄1 + zz2 w̄2 + ... + zzn w̄n = z(z1 w̄1 + z2 w̄2 + ... + zn w̄n) = z(u·v)

(iii) Method 1. Since zv = (zw1, zw2, ..., zwn),
    u·(zv) = z1 conj(zw1) + z2 conj(zw2) + ... + zn conj(zwn) = z1 z̄ w̄1 + z2 z̄ w̄2 + ... + zn z̄ w̄n
           = z̄(z1 w̄1 + z2 w̄2 + ... + zn w̄n) = z̄(u·v)

Method 2. Using (i) and (ii),
    u·(zv) = conj((zv)·u) = conj(z(v·u)) = z̄ conj(v·u) = z̄(u·v)

MISCELLANEOUS PROBLEMS

1.28. Let u = (3, -2, 1, 4) and v = (7, 1, -3, 6). Find: (i) u + v; (ii) 4u; (iii) 2u - 3v; (iv) u·v; (v) ||u|| and ||v||; (vi) d(u, v).

(i) u + v = (3 + 7, -2 + 1, 1 - 3, 4 + 6) = (10, -1, -2, 10)
(ii) 4u = (4·3, 4·(-2), 4·1, 4·4) = (12, -8, 4, 16)
(iii) 2u - 3v = (6, -4, 2, 8) + (-21, -3, 9, -18) = (-15, -7, 11, -10)
(iv) u·v = 21 - 2 - 3 + 24 = 40
(v) ||u|| = sqrt(9 + 4 + 1 + 16) = sqrt(30),  ||v|| = sqrt(49 + 1 + 9 + 36) = sqrt(95)
(vi) d(u, v) = sqrt((3-7)^2 + (-2-1)^2 + (1+3)^2 + (4-6)^2) = sqrt(45) = 3 sqrt(5)

1.29. Let u = (7 - 2i, 2 + 5i) and v = (1 + i, -3 - 6i). Find: (i) u + v; (ii) 2iu; (iii) (3 - i)v; (iv) u·v; (v) ||u|| and ||v||.
(i) u + v = (7 - 2i + 1 + i, 2 + 5i - 3 - 6i) = (8 - i, -1 - i)
(ii) 2iu = (14i - 4i^2, 4i + 10i^2) = (4 + 14i, -10 + 4i)
(iii) (3 - i)v = (3 + 3i - i - i^2, -9 - 18i + 3i + 6i^2) = (4 + 2i, -15 - 15i)
(iv) u·v = (7 - 2i)conj(1 + i) + (2 + 5i)conj(-3 - 6i) = (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i
(v) ||u|| = sqrt(7^2 + (-2)^2 + 2^2 + 5^2) = sqrt(82),  ||v|| = sqrt(1^2 + 1^2 + (-3)^2 + (-6)^2) = sqrt(47)

1.30. Any pair of points P = (ai) and Q = (bi) in R^n defines the directed line segment from P to Q, written PQ. We identify PQ with the vector v = Q - P:
    PQ = v = (b1 - a1, b2 - a2, ..., bn - an)
Find the vector v identified with PQ where:
(i) P = (2, 5), Q = (-3, 4)
(ii) P = (1, -2, 4), Q = (6, 0, -3)

(i) v = Q - P = (-3 - 2, 4 - 5) = (-5, -1)
(ii) v = Q - P = (6 - 1, 0 + 2, -3 - 4) = (5, 2, -7)

1.31. The set H of elements in R^n which are solutions of a linear equation in n unknowns x1, ..., xn of the form
    c1x1 + c2x2 + ... + cnxn = b    (*)
with u = (c1, ..., cn) ≠ 0 in R^n, is called a hyperplane of R^n, and (*) is called an equation of H. (We frequently identify H with (*).) Show that the directed line segment PQ of any pair of points P, Q ∈ H is orthogonal to the coefficient vector u; the vector u is said to be normal to the hyperplane H.

Suppose P = (a1, ..., an) and Q = (b1, ..., bn). Then the ai and the bi are solutions of the given equation:
    c1a1 + c2a2 + ... + cnan = b,    c1b1 + c2b2 + ... + cnbn = b
Let
    v = PQ = Q - P = (b1 - a1, b2 - a2, ..., bn - an)
Then
    u·v = c1(b1 - a1) + c2(b2 - a2) + ... + cn(bn - an)
        = c1b1 - c1a1 + c2b2 - c2a2 + ... + cnbn - cnan
        = (c1b1 + c2b2 + ... + cnbn) - (c1a1 + c2a2 + ... + cnan) = b - b = 0
Hence v, that is, PQ, is orthogonal to u.

1.32. Find an equation of the hyperplane H in R^4 if: (i) H passes through P = (3, -2, 1, -4) and is normal to u = (2, 5, -6, -2); (ii) H passes through P = (1, -2, 3, 5) and is parallel to the hyperplane H' determined by 4x - 5y + 2z + w = 11.
(i) An equation of H is of the form 2x + 5y - 6z - 2w = k since H is normal to u. Substitute P into this equation to obtain k = -2. Thus an equation of H is 2x + 5y - 6z - 2w = -2.

(ii) H and H' are parallel iff corresponding normal vectors are in the same or opposite direction. Hence an equation of H is of the form 4x - 5y + 2z + w = k. Substituting P into this equation, we find k = 25. Thus an equation of H is 4x - 5y + 2z + w = 25.

1.33. The line l in R^n passing through the point P = (ai) and in the direction of u = (ui) ≠ 0 consists of the points X = P + tu, t ∈ R, that is, consists of the points X = (xi) obtained from
    x1 = a1 + u1 t
    x2 = a2 + u2 t    (*)
    ..............
    xn = an + un t
where t takes on all real values. The variable t is called a parameter, and (*) is called a parametric representation of l.

(i) Find a parametric representation of the line passing through P and in the direction of u where: (a) P = (2, 5) and u = (-3, 4); (b) P = (4, -2, 3, 1) and u = (2, 5, -7, 11).

(ii) Find a parametric representation of the line passing through the points P and Q where: (a) P = (7, -2) and Q = (9, 3); (b) P = (5, 4, -3) and Q = (1, -3, 2).

(i) In each case use the formula (*).
    (a) x = 2 - 3t, y = 5 + 4t    (b) x = 4 + 2t, y = -2 + 5t, z = 3 - 7t, w = 1 + 11t
(In R^2 we usually eliminate t from the two equations and represent the line by a single equation: 4x + 3y = 23.)

(ii) First compute u = PQ = Q - P. Then use the formula (*).
    (a) u = Q - P = (2, 5):  x = 7 + 2t, y = -2 + 5t
    (b) u = Q - P = (-4, -7, 5):  x = 5 - 4t, y = 4 - 7t, z = -3 + 5t
(Note that in each case we could also write u = QP = P - Q.)

Supplementary Problems

VECTORS IN R^n

1.34. Let u = (1, -2, 5), v = (3, 1, -2). Find: (i) u + v; (ii) -6u; (iii) 2u - 5v; (iv) u·v; (v) ||u|| and ||v||; (vi) d(u, v).

1.35. Let u = (2, -1, 0, -3), v = (1, -1, -1, 3), w = (1, 3, -2, 2). Find: (i) 2u - 3v; (ii) 5u - 3v - 4w; (iii) -u + 2v - 2w; (iv) u·v, u·w and v·w; (v) d(u, v) and d(v, w).

1.36. Let u = (2, 1, -3, 0, 4), v = (5, -3, -1, 2, 7).
Find: (i) u + v; (ii) 3u - 2v; (iii) u·v; (iv) ||u|| and ||v||; (v) d(u, v).

1.37. Determine k so that the vectors u and v are orthogonal. (i) u = (3, k, -2), v = (6, -4, -3). (ii) u = (5, k, -4, 2), v = (1, -3, 2, 2k). (iii) u = (1, 7, k + 2, -2), v = (3, k, -3, k).

1.38. Determine x and y if: (i) (x, x + y) = (y - 2, 6); (ii) x(1, 2) = -4(y, 3).

1.39. Determine x and y if: (i) x(3, 2) = 2(y, -1); (ii) x(2, y) = y(1, -2).

1.40. Determine x, y and z if:
(i) (3, -1, 2) = x(1, 1, 1) + y(1, -1, 0) + z(1, 0, 0)
(ii) (-1, 3, 3) = x(1, 1, 0) + y(0, 0, -1) + z(0, 1, 1)

1.41. Let e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Show that for any vector u = (a, b, c) in R^3: (i) u = ae1 + be2 + ce3; (ii) u·e1 = a, u·e2 = b, u·e3 = c.

1.42. Generalize the result in the preceding problem as follows. Let ei ∈ R^n be the vector with 1 in the ith coordinate and 0 elsewhere:
    e1 = (1, 0, 0, ..., 0, 0), e2 = (0, 1, 0, ..., 0, 0), ..., en = (0, 0, 0, ..., 0, 1)
Show that for any vector u = (a1, a2, ..., an),
    (i) u = a1e1 + a2e2 + ... + anen,  (ii) u·ei = ai for i = 1, ..., n

1.43. Suppose u ∈ R^n has the property that u·v = 0 for every v ∈ R^n. Show that u = 0.

1.44. Using d(u, v) = ||u - v|| and the norm properties [N1], [N2] and [N3] in Problem 1.18, show that the distance function satisfies the following properties for any vectors u, v, w ∈ R^n: (i) d(u, v) ≥ 0, and d(u, v) = 0 iff u = v; (ii) d(u, v) = d(v, u); (iii) d(u, w) ≤ d(u, v) + d(v, w).

COMPLEX NUMBERS

1.45. Simplify: (i) (4 - 7i)(9 + 2i); (ii) (3 - 5i)^2; (iii) 1/(4 - 7i); (iv) (9 + 2i)/(3 - 5i); (v) (1 - i)^3.

1.46. Simplify: (i) 1/(2i); (ii) (2 + 3i)/(7 - 3i); (iii) i^15, i^25, i^34; (iv) (1/(3 - i))^2.

1.47. Let z = 2 - 5i and w = 7 + 3i. Find: (i) z + w; (ii) zw; (iii) z/w; (iv) z̄, w̄; (v) |z|, |w|.

1.48. Let z = 2 + i and w = 6 - 5i. Find: (i) z/w; (ii) z̄, w̄; (iii) |z|, |w|.

1.49. Show that: (i) zz^(-1) = 1; (ii) |z̄| = |z|; (iii) real part of z = (1/2)(z + z̄); (iv) imaginary part of z = (z - z̄)/2i.

1.50. Show that zw = 0 implies z = 0 or w = 0.

VECTORS IN C^n

1.51.
Let u = (1 + 7i, 2 - 6i) and v = (5 - 2i, 3 - 4i). Find: (i) u + v; (ii) (3 + i)u; (iii) 2iu + (4 - 7i)v; (iv) u·v and v·u; (v) ||u|| and ||v||.

1.52. Let u = (3 - 7i, 2i, -1 + i) and v = (4 - i, 11 + 2i, 8 - 3i). Find: (i) u - v; (ii) (3 + i)v; (iii) u·v and v·u; (iv) ||u|| and ||v||.

1.53. Prove: For any vectors u, v, w ∈ C^n: (i) (u + v)·w = u·w + v·w; (ii) w·(u + v) = w·u + w·v. (Compare with Theorem 1.2.)

1.54. Prove that the norm in C^n satisfies the following laws:
[N1]: For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.
[N2]: For any vector u and any complex number z, ||zu|| = |z| ||u||.
[N3]: For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.
(Compare with Problem 1.18.)

MISCELLANEOUS PROBLEMS

1.55. Find an equation of the hyperplane in R^3 which: (i) passes through (2, -7, 1) and is normal to (3, 1, -11); (ii) contains (1, -2, 2), (0, 1, 3) and (0, 2, -1); (iii) contains (1, -5, 2) and is parallel to 3x - 7y + 4z = 5.

1.56. Determine the value of k such that 2x - ky + 4z - 5w = 11 is perpendicular to 7x + 2y - z + 2w = 8. (Two hyperplanes are perpendicular iff corresponding normal vectors are orthogonal.)

1.57. Find a parametric representation of the line which:
(i) passes through (7, -1, 8) in the direction of (1, 3, -5)
(ii) passes through (1, 9, -4, 5) and (2, -3, 0, 4)
(iii) passes through (4, -1, 9) and is perpendicular to the plane 3x - 2y + z = 18

1.58. Let P, Q and R be the points on the line determined by
    x1 = a1 + u1 t,  x2 = a2 + u2 t,  ...,  xn = an + un t
which correspond respectively to the values t1, t2 and t3 of t. Show that if t1 < t2 < t3, then d(P, Q) + d(Q, R) = d(P, R).

Answers to Supplementary Problems

1.34. (i) u + v = (4, -1, 3); (ii) -6u = (-6, 12, -30); (iii) 2u - 5v = (-13, -9, 20); (iv) u·v = -9; (v) ||u|| = sqrt(30), ||v|| = sqrt(14); (vi) d(u, v) = sqrt(62)

1.35.
(i) 2u - 3v = (1, 1, 3, -15); (ii) 5u - 3v - 4w = (3, -14, 11, -32); (iii) -u + 2v - 2w = (-2, -7, 2, 5); (iv) u·v = -6, u·w = -7, v·w = 6; (v) d(u, v) = sqrt(38), d(v, w) = 3 sqrt(2)

1.36. (i) u + v = (7, -2, -4, 2, 11); (ii) 3u - 2v = (-4, 9, -7, -4, -2); (iii) u·v = 38; (iv) ||u|| = sqrt(30), ||v|| = 2 sqrt(22); (v) d(u, v) = sqrt(42)

1.37. (i) k = 6; (ii) k = 3; (iii) k = 3/2

1.38. (i) x = 2, y = 4; (ii) x = -6, y = 3/2

1.39. (i) x = -1, y = -3/2; (ii) x = 0, y = 0; or x = -2, y = -4

1.40. (i) x = 2, y = 3, z = -2; (ii) x = -1, y = 1, z = 4

1.43. We have that u·u = 0, which implies that u = 0.

1.45. (i) 50 - 55i; (ii) -16 - 30i; (iii) (4 + 7i)/65; (iv) (1 + 3i)/2; (v) -2 - 2i.

1.46. (i) -(1/2)i; (ii) (5 + 27i)/58; (iii) -i, i, -1; (iv) (4 + 3i)/50.

1.47. (i) z + w = 9 - 2i; (ii) zw = 29 - 29i; (iii) z/w = (-1 - 41i)/58; (iv) z̄ = 2 + 5i, w̄ = 7 - 3i; (v) |z| = sqrt(29), |w| = sqrt(58).

1.48. (i) z/w = (7 + 16i)/61; (ii) z̄ = 2 - i, w̄ = 6 + 5i; (iii) |z| = sqrt(5), |w| = sqrt(61).

1.50. If zw = 0, then |zw| = |z| |w| = |0| = 0. Hence |z| = 0 or |w| = 0; and so z = 0 or w = 0.

1.51. (i) u + v = (6 + 5i, 5 - 10i); (ii) (3 + i)u = (-4 + 22i, 12 - 16i); (iii) 2iu + (4 - 7i)v = (-8 - 41i, -4 - 33i); (iv) u·v = 21 + 27i, v·u = 21 - 27i; (v) ||u|| = 3 sqrt(10), ||v|| = 3 sqrt(6)

1.52. (i) u - v = (-1 - 6i, -11, -9 + 4i); (ii) (3 + i)v = (13 + i, 31 + 17i, 27 - i); (iii) u·v = 12 + 2i, v·u = 12 - 2i; (iv) ||u|| = 8, ||v|| = sqrt(215)

1.55. (i) 3x + y - 11z = -12; (ii) 13x + 4y + z = 7; (iii) 3x - 7y + 4z = 46.

1.56. k = 0

1.57. (i) x = 7 + t, y = -1 + 3t, z = 8 - 5t; (ii) x = 1 + t, y = 9 - 12t, z = -4 + 4t, w = 5 - t; (iii) x = 4 + 3t, y = -1 - 2t, z = 9 + t

Chapter 2

Linear Equations

INTRODUCTION

The theory of linear equations plays an important and motivating role in the subject of linear algebra. In fact, many problems in linear algebra are equivalent to studying a system of linear equations, e.g. finding the kernel of a linear mapping and characterizing the subspace spanned by a set of vectors. Thus the techniques introduced in this chapter will be applicable to the more abstract treatment given later.
On the other hand, some of the results of the abstract treatment will give us new insights into the structure of "concrete" systems of linear equations.

For simplicity, we assume that all equations in this chapter are over the real field R. We emphasize that the results and techniques also hold for equations over the complex field C or over any arbitrary field K.

LINEAR EQUATION

By a linear equation over the real field R, we mean an expression of the form
    a1x1 + a2x2 + ... + anxn = b    (1)
where the ai, b ∈ R and the xi are indeterminates (or: unknowns or variables). The scalars ai are called the coefficients of the xi respectively, and b is called the constant term or simply constant of the equation.

A set of values for the unknowns, say
    x1 = k1, x2 = k2, ..., xn = kn
is a solution of (1) if the statement obtained by substituting ki for xi,
    a1k1 + a2k2 + ... + ankn = b
is true. This set of values is then said to satisfy the equation. If there is no ambiguity about the position of the unknowns in the equation, then we denote this solution by simply the n-tuple
    u = (k1, k2, ..., kn)

Example 2.1: Consider the equation x + 2y - 4z + w = 3. The 4-tuple u = (3, 2, 1, 0) is a solution of the equation since
    3 + 2·2 - 4·1 + 0 = 3 or 3 = 3
is a true statement. However, the 4-tuple v = (1, 2, 4, 5) is not a solution of the equation since
    1 + 2·2 - 4·4 + 5 = 3 or -6 = 3
is not a true statement.

Solutions of the equation (1) can be easily described and obtained. There are three cases:

Case (i): One of the coefficients in (1) is not zero, say a1 ≠ 0. Then we can rewrite the equation as follows:
    a1x1 = b - a2x2 - ... - anxn  or  x1 = a1^(-1)b - a1^(-1)a2x2 - ... - a1^(-1)anxn
By arbitrarily assigning values to the unknowns x2, ..., xn, we obtain a value for x1; these values form a solution of the equation. Furthermore, every solution of the equation can be obtained in this way.
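The recipe of Case (i) is easy to mechanize. Below is a minimal Python sketch (the helper name solve_for_first and the list-based representation are ours, not the book's): given the coefficients and constant of a linear equation with a1 ≠ 0, and arbitrarily chosen values for x2, ..., xn, it computes the matching value of x1.

```python
def solve_for_first(coeffs, b, free_values):
    """Given a1*x1 + ... + an*xn = b with a1 != 0, and chosen values
    for x2, ..., xn, return the full solution tuple (x1, x2, ..., xn)."""
    a1, rest = coeffs[0], coeffs[1:]
    # Case (i): x1 = a1^(-1) * (b - a2*x2 - ... - an*xn)
    x1 = (b - sum(a * v for a, v in zip(rest, free_values))) / a1
    return (x1, *free_values)

# The equation 2x - 4y + z = 8 of Example 2.2 with y = 3, z = 2
# yields the solution u = (9, 3, 2).
solution = solve_for_first([2, -4, 1], 8, [3, 2])
```

Any other choice of the free values gives another solution, mirroring the remark that every solution arises this way.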
Note in particular that the linear equation in one unknown,
    ax = b, with a ≠ 0
has the unique solution x = a^(-1)b.

Example 2.2: Consider the equation 2x - 4y + z = 8. We rewrite the equation as
    2x = 8 + 4y - z  or  x = 4 + 2y - (1/2)z
Any value for y and z will yield a value for x, and the three values will be a solution of the equation. For example, let y = 3 and z = 2; then x = 4 + 2·3 - (1/2)·2 = 9. In other words, the 3-tuple u = (9, 3, 2) is a solution of the equation.

Case (ii): All the coefficients in (1) are zero, but the constant is not zero. That is, the equation is of the form
    0x1 + 0x2 + ... + 0xn = b, with b ≠ 0
Then the equation has no solution.

Case (iii): All the coefficients in (1) are zero, and the constant is also zero. That is, the equation is of the form
    0x1 + 0x2 + ... + 0xn = 0
Then every n-tuple of scalars in R is a solution of the equation.

SYSTEM OF LINEAR EQUATIONS

We now consider a system of m linear equations in the n unknowns x1, ..., xn:
    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    .................................    (*)
    am1 x1 + am2 x2 + ... + amn xn = bm
where the aij, bi belong to the real field R. The system is said to be homogeneous if the constants b1, ..., bm are all 0. An n-tuple u = (k1, ..., kn) of real numbers is a solution (or: a particular solution) if it satisfies each of the equations; the set of all such solutions is termed the solution set or the general solution.

The system of linear equations
    a11 x1 + a12 x2 + ... + a1n xn = 0
    a21 x1 + a22 x2 + ... + a2n xn = 0
    .................................    (**)
    am1 x1 + am2 x2 + ... + amn xn = 0
is called the homogeneous system associated with (*). The above system always has a solution, namely the zero n-tuple 0 = (0, 0, ..., 0), called the zero or trivial solution. Any other solution, if it exists, is called a nonzero or nontrivial solution.

The fundamental relationship between the systems (*) and (**) follows.

Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system (*) and suppose W is the general solution of the associated homogeneous system (**). Then
    u + W = {u + w : w ∈ W}
is the general solution of the nonhomogeneous system (*).

We emphasize that the above theorem is of theoretical interest and does not help us to obtain explicit solutions of the system (*). This is done by the usual method of elimination described in the next section.

SOLUTION OF A SYSTEM OF LINEAR EQUATIONS

Consider the above system (*) of linear equations. We reduce it to a simpler system as follows:

Step 1. Interchange equations so that the first unknown x1 has a nonzero coefficient in the first equation, that is, so that a11 ≠ 0.

Step 2. For each i > 1, apply the operation
    Li → -ai1 L1 + a11 Li
That is, replace the ith linear equation Li by the equation obtained by multiplying the first equation L1 by -ai1, multiplying the ith equation Li by a11, and then adding.

We then obtain the following system which (Problem 2.13) is equivalent to (*), i.e. has the same solution set as (*):
    a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
    a'2,j2 x_j2 + ... + a'2n xn = b'2
    .................................
    a'm,j2 x_j2 + ... + a'mn xn = b'm
where a11 ≠ 0. Here x_j2 denotes the first unknown with a nonzero coefficient in an equation other than the first; by Step 2, x_j2 ≠ x1. This process which eliminates an unknown from succeeding equations is known as (Gauss) elimination.

Example 2.3: Consider the following system of linear equations:
    2x + 4y - z + 2v + 2w = 1
    3x + 6y + z - v + 4w = -7
    4x + 8y + z + 5v - w = 3
We eliminate the unknown x from the second and third equations by applying the following operations:
    L2 → -3L1 + 2L2  and  L3 → -2L1 + L3
We compute
    -3L1:        -6x - 12y + 3z - 6v - 6w = -3
    2L2:          6x + 12y + 2z - 2v + 8w = -14
    -3L1 + 2L2:              5z - 8v + 2w = -17
and
    -2L1:        -4x - 8y + 2z - 4v - 4w = -2
    L3:           4x + 8y + z + 5v - w = 3
    -2L1 + L3:               3z + v - 5w = 1

Thus the original system has been reduced to the following equivalent system:
    2x + 4y - z + 2v + 2w = 1
    5z - 8v + 2w = -17
    3z + v - 5w = 1
Observe that y has also been eliminated from the second and third equations. Here the unknown z plays the role of the unknown x_j2 above.

We note that the above equations, excluding the first, form a subsystem which has fewer equations and fewer unknowns than the original system (*). We also note that:

(i) if an equation 0x1 + ... + 0xn = b, b ≠ 0 occurs, then the system is inconsistent and has no solution;

(ii) if an equation 0x1 + ... + 0xn = 0 occurs, then the equation can be deleted without affecting the solution.

Continuing the above process with each new "smaller" subsystem, we obtain by induction that the system (*) is either inconsistent or is reducible to an equivalent system in the following form
    a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
    a2,j2 x_j2 + a2,j2+1 x_(j2+1) + ... + a2n xn = b2
    .................................    (***)
    ar,jr x_jr + ar,jr+1 x_(jr+1) + ... + arn xn = br
where 1 < j2 < ... < jr and where the leading coefficients are not zero:
    a11 ≠ 0, a2,j2 ≠ 0, ..., ar,jr ≠ 0
(For notational convenience we use the same symbols aik, bk in the system (***) as we used in the system (*), but clearly they may denote different scalars.)

Definition: The above system (***) is said to be in echelon form; the unknowns xi which do not appear at the beginning of any equation (i ≠ 1, j2, ..., jr) are termed free variables.

The following theorem applies.

Theorem 2.2: The solution of the system (***) in echelon form is as follows. There are two cases:

(i) r = n. That is, there are as many equations as unknowns. Then the system has a unique solution.

(ii) r < n. That is, there are fewer equations than unknowns. Then we can arbitrarily assign values to the n - r free variables and obtain a solution of the system.

Note in particular that the above theorem implies that the system (***) and any equivalent systems are consistent.
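Step 2 above translates directly into code. The sketch below (the function name is ours, not the book's) stores each equation as its coefficient row [a1, ..., an, b] and applies Li → -ai1 L1 + a11 Li to every equation below the first:

```python
def eliminate_first_unknown(system):
    """One round of the elimination step: replace Li by
    -a_i1*L1 + a_11*Li for each i > 1. Each equation is the list
    [a1, ..., an, b] of its coefficients and constant."""
    first = system[0]
    a11 = first[0]
    assert a11 != 0  # Step 1 guarantees a nonzero leading coefficient
    reduced = [first]
    for eq in system[1:]:
        ai1 = eq[0]
        reduced.append([-ai1 * f + a11 * e for f, e in zip(first, eq)])
    return reduced

# The system of Example 2.3:
#   2x + 4y - z + 2v + 2w = 1
#   3x + 6y + z -  v + 4w = -7
#   4x + 8y + z + 5v -  w = 3
system = [[2, 4, -1, 2, 2, 1], [3, 6, 1, -1, 4, -7], [4, 8, 1, 5, -1, 3]]
reduced = eliminate_first_unknown(system)
```

The second row of the result is 5z - 8v + 2w = -17, exactly as in Example 2.3; the third row comes out as twice 3z + v - 5w = 1, since the text there uses the halved operation -2L1 + L3 in place of -4L1 + 2L3.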
Thus if the system (*) is consistent and reduces to case (ii) above, then we can assign many different values to the free variables and so obtain many solutions of the system. The following diagram illustrates this situation.

[Diagram: a system of linear equations is either inconsistent (no solution) or consistent; a consistent system has either a unique solution or more than one solution.]

In view of Theorem 2.1, the unique solution above can only occur when the associated homogeneous system has only the zero solution.

Example 2.4: We reduce the following system by applying the operations L2 → -3L1 + 2L2 and L3 → -3L1 + 2L3, and then the operation L3 → -3L2 + L3:
    2x + y - 2z + 3w = 1
    3x + 2y - z + 2w = 4
    3x + 3y + 3z - 3w = 5
to
    2x + y - 2z + 3w = 1
    y + 4z - 5w = 5
    3y + 12z - 15w = 7
to
    2x + y - 2z + 3w = 1
    y + 4z - 5w = 5
    0 = -8
The equation 0 = -8, that is, 0x + 0y + 0z + 0w = -8, shows that the original system is inconsistent, and so has no solution.

Example 2.5: We reduce the following system by applying the operations L2 → -L1 + L2, L3 → -2L1 + L3 and L4 → -2L1 + L4, and then the operations L3 → L2 - L3 and L4 → -2L2 + L4:
    x + 2y - 3z = 4
    x + 3y + z = 11
    2x + 5y - 4z = 13
    2x + 6y + 2z = 22
to
    x + 2y - 3z = 4
    y + 4z = 7
    y + 2z = 5
    2y + 8z = 14
to
    x + 2y - 3z = 4
    y + 4z = 7
    2z = 2
    0 = 0
to
    x + 2y - 3z = 4
    y + 4z = 7
    2z = 2
Observe first that the system is consistent since there is no equation of the form 0 = b, with b ≠ 0. Furthermore, since in echelon form there are three equations in the three unknowns, the system has a unique solution. By the third equation, z = 1. Substituting z = 1 into the second equation, we obtain y = 3. Substituting z = 1 and y = 3 into the first equation, we find x = 1. Thus x = 1, y = 3 and z = 1 or, in other words, the 3-tuple (1, 3, 1), is the unique solution of the system.
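A conclusion like the one in Example 2.5 is cheap to verify by substitution. A small checker (ours, not the book's) that plugs a candidate tuple into every equation:

```python
def satisfies(solution, equations):
    """Check that a candidate n-tuple satisfies every equation,
    where each equation is given as a pair ([a1, ..., an], b)."""
    return all(
        sum(a * x for a, x in zip(coeffs, solution)) == b
        for coeffs, b in equations
    )

# The four equations of Example 2.5, whose unique solution is (1, 3, 1).
example_2_5 = [([1, 2, -3], 4), ([1, 3, 1], 11),
               ([2, 5, -4], 13), ([2, 6, 2], 22)]
```

Checking a back-substitution result this way catches arithmetic slips before they propagate.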
Example 2.6: We reduce the following system by applying the operations L2 → -2L1 + L2 and L3 → -5L1 + L3, and then the operation L3 → -2L2 + L3:
    x + 2y - 2z + 3w = 2
    2x + 4y - 3z + 4w = 5
    5x + 10y - 8z + 11w = 12
to
    x + 2y - 2z + 3w = 2
    z - 2w = 1
    2z - 4w = 2
to
    x + 2y - 2z + 3w = 2
    z - 2w = 1
    0 = 0
to
    x + 2y - 2z + 3w = 2
    z - 2w = 1

The system is consistent, and since there are more unknowns than equations in echelon form, the system has an infinite number of solutions. In fact, there are two free variables, y and w, and so a particular solution can be obtained by giving y and w any values. For example, let w = 1 and y = -2. Substituting w = 1 into the second equation, we obtain z = 3. Putting w = 1, z = 3 and y = -2 into the first equation, we find x = 9. Thus x = 9, y = -2, z = 3 and w = 1 or, in other words, the 4-tuple (9, -2, 3, 1) is a particular solution of the system.

Remark: We find the general solution of the system in the above example as follows. Let the free variables be assigned arbitrary values; say, y = a and w = b. Substituting w = b into the second equation, we obtain z = 1 + 2b. Putting y = a, z = 1 + 2b and w = b into the first equation, we find x = 4 - 2a + b. Thus the general solution of the system is
    x = 4 - 2a + b, y = a, z = 1 + 2b, w = b
or, in other words, (4 - 2a + b, a, 1 + 2b, b), where a and b are arbitrary numbers. Frequently, the general solution is left in terms of the free variables y and w (instead of a and b) as follows:
    x = 4 - 2y + w, z = 1 + 2w  or  (4 - 2y + w, y, 1 + 2w, w)
We will investigate further the representation of the general solution of a system of linear equations in a later chapter.

Example 2.7: Consider two equations in two unknowns:
    a1x + b1y = c1
    a2x + b2y = c2
According to our theory, exactly one of the following three cases must occur:
(i) The system is inconsistent.
(ii) The system is equivalent to two equations in echelon form.
(iii) The system is equivalent to one equation in echelon form.
Since linear equations in two unknowns with real coefficients can be represented as lines in the plane R^2, the above cases can be interpreted geometrically as follows:
(i) The two lines are parallel.
(ii) The two lines intersect in a unique point.
(iii) The two lines are coincident.

SOLUTION OF A HOMOGENEOUS SYSTEM OF LINEAR EQUATIONS

If we begin with a homogeneous system of linear equations, then the system is clearly consistent since, for example, it has the zero solution 0 = (0, 0, ..., 0). Thus it can always be reduced to an equivalent homogeneous system in echelon form:
    a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = 0
    a2,j2 x_j2 + a2,j2+1 x_(j2+1) + ... + a2n xn = 0
    .................................
    ar,jr x_jr + ar,jr+1 x_(jr+1) + ... + arn xn = 0
Hence we have the two possibilities:
(i) r = n. Then the system has only the zero solution.
(ii) r < n. Then the system has a nonzero solution.
If we begin with fewer equations than unknowns then, in echelon form, r < n and hence the system has a nonzero solution. That is,

Theorem 2.3: A homogeneous system of linear equations with more unknowns than equations has a nonzero solution.

Example 2.8: The homogeneous system
    x + 2y - 3z + w = 0
    x - 3y + z - 2w = 0
    2x + y - 3z + 5w = 0
has a nonzero solution since there are four unknowns but only three equations.

Example 2.9: We reduce the following system to echelon form:
    x + y - z = 0
    2x - 3y + z = 0
    x - 4y + 2z = 0
to
    x + y - z = 0
    -5y + 3z = 0
    -5y + 3z = 0
to
    x + y - z = 0
    -5y + 3z = 0
The system has a nonzero solution, since we obtained only two equations in the three unknowns in echelon form. For example, let z = 5; then y = 3 and x = 2. In other words, the 3-tuple (2, 3, 5) is a particular nonzero solution.

Example 2.10: We reduce the following system to echelon form:
    x + y - z = 0
    2x + 4y - z = 0
    3x + 2y + 2z = 0
to
    x + y - z = 0
    2y + z = 0
    -y + 5z = 0
to
    x + y - z = 0
    2y + z = 0
    11z = 0
Since in echelon form there are three equations in three unknowns, the system has only the zero solution (0, 0, 0).

Solved Problems

SOLUTION OF LINEAR EQUATIONS

2.1. Solve the system:
    2x - 3y + 6z + 2v - 5w = 3
    y - 4z + v = 1
    v - 3w = 2

The system is in echelon form. Since the equations begin with the unknowns x, y and v respectively, the other unknowns, z and w, are the free variables.

To find the general solution, let, say, z = a and w = b. Substituting into the third equation,
    v - 3b = 2  or  v = 2 + 3b
Substituting into the second equation,
    y - 4a + 2 + 3b = 1  or  y = 4a - 3b - 1
Substituting into the first equation,
    2x - 3(4a - 3b - 1) + 6a + 2(2 + 3b) - 5b = 3  or  x = 3a - 5b - 2
Thus the general solution of the system is
    x = 3a - 5b - 2, y = 4a - 3b - 1, z = a, v = 2 + 3b, w = b
or (3a - 5b - 2, 4a - 3b - 1, a, 2 + 3b, b), where a and b are arbitrary real numbers. Some texts leave the general solution in terms of the free variables z and w instead of a and b as follows:
Solve the system: y - ^z + v 5w = 3 = 1 . Zw - 2 The system is in echelon form. Since the equations begin with the unknowns x,y and v re- spectively, the other unknowns, z and w, are the free variables. To find the general solution, let, say, z = a and w = 6. Substituting into the third equation, •y - 36 = 2 or -y = 2 + 36 Substituting into the second equation, J/ - 4a + 2 + 36 = 1 01 Substituting into the first equation, 2x - 3(4a - 36 - 1) + 6a + 2(2 + 36) - 56 = 3 Thus the general solution of the system is a; = 3a - 56 - 2, j/ = 4a - 36 - 1, z - y = 4a - 36 - 1 3a - 56 ■u = 2 + 36, w = 6 or (3a-56-2, 4a-36-l, a, 2 + 36, 6), where a and 6 are arbitrary real numbers. Some texts leave the general solution in terms of the free variables z and w instead of a and 6 as follows: CHAP. 2] LINEAR EQUATIONS 25 X = Sz — 5w — 2 y = 4z — 3w — 1 or (3« — 5w — 2, 4z - 3w - 1, z, 2 + 3w, w) V - 2 + 3w After finding the general solution, we can find a particular solution by substituting into the general solution. For example, let a = 2 and 6 = 1; then X = -1, J/ = 4, z = 2, v = 5, w = 1 or (-1, 4, 2, 5, 1) is a particular solution of the given system. X + 2y-3z = -1 2.2. Solve the system: 3a; — y + 2z = 7 . 5x + 3y - 4z = 2 Reduce to echelon form. Eliminate x from the second and third equations by the operations 1,2 -♦ — 3Li + L2 and Lg -* —5Li + L3: -3Li: -3x -6y+9z= 3 -SLj: -5x - lOy + 15z = 5 L2: 3x- y + 2z = 7 L3: 5a: + 3^/ - 4z = 2 -3Li + Lg: -7j/ + llz = 10 -5Li + L3: -7j/ + llz = 7 Thus we obtain the equivalent system X + 2y — 3z = -1 -7y + llz = 10 -7y + llz = 7 The second and third equations show that the system is inconsistent, for if we subtract we obtain Ox + Oy + Oz = S or = 3. 2x+ y-2z = 10 2.3. Solve the system: 3x + 2y + 2z = 1 . 5a; + 42/ + 32 = 4 Reduce to echelon form. 
Eliminate x from the second and third equations by the operations L2 → -3L1 + 2L2 and L3 → -5L1 + 2L3:
    -3L1:        -6x - 3y + 6z = -30
    2L2:          6x + 4y + 4z = 2
    -3L1 + 2L2:       y + 10z = -28
and
    -5L1:        -10x - 5y + 10z = -50
    2L3:          10x + 8y + 6z = 8
    -5L1 + 2L3:       3y + 16z = -42
Thus we obtain the following system, from which we eliminate y from the third equation by the operation L3 → -3L2 + L3:
    2x + y - 2z = 10
    y + 10z = -28
    3y + 16z = -42
to
    2x + y - 2z = 10
    y + 10z = -28
    -14z = 42
In echelon form there are three equations in the three unknowns; hence the system has a unique solution. By the third equation, z = -3. Substituting into the second equation, we find y = 2. Substituting into the first equation, we obtain x = 1. Thus x = 1, y = 2 and z = -3, i.e. the 3-tuple (1, 2, -3), is the unique solution of the system.

2.4. Solve the system:
    x + 2y - 3z = 6
    2x - y + 4z = 2
    4x + 3y - 2z = 14

Reduce the system to echelon form. Eliminate x from the second and third equations by the operations L2 → -2L1 + L2 and L3 → -4L1 + L3:
    -2L1:       -2x - 4y + 6z = -12
    L2:          2x - y + 4z = 2
    -2L1 + L2:      -5y + 10z = -10,  or  y - 2z = 2
and
    -4L1:       -4x - 8y + 12z = -24
    L3:          4x + 3y - 2z = 14
    -4L1 + L3:      -5y + 10z = -10,  or  y - 2z = 2
Thus the system is equivalent to
    x + 2y - 3z = 6
    y - 2z = 2
    y - 2z = 2
or simply
    x + 2y - 3z = 6
    y - 2z = 2
(Since the second and third equations are identical, we can disregard one of them.)

In echelon form there are only two equations in the three unknowns; hence the system has an infinite number of solutions and, in particular, 3 - 2 = 1 free variable, which is z.

To obtain the general solution let, say, z = a. Substitute into the second equation to obtain y = 2 + 2a. Substitute into the first equation to obtain x + 2(2 + 2a) - 3a = 6 or x = 2 - a. Thus the general solution is
    x = 2 - a, y = 2 + 2a, z = a  or  (2 - a, 2 + 2a, a)
where a is any real number. The value, say, a = 1 yields the particular solution x = 1, y = 4, z = 1 or (1, 4, 1).

2.5. Solve the system:
    x - 3y + 4z - 2w = 5
    2y + 5z + w = 2
    y - 3z = 4

The system is not in echelon form since, for example, y appears as the first unknown in both the second and third equations. However, if we rewrite the system so that w is the second unknown, then we obtain the following system which is in echelon form:
    x - 2w - 3y + 4z = 5
    w + 2y + 5z = 2
    y - 3z = 4
Now if a 4-tuple (a, b, c, d) is given as a solution, it is not clear if b should be substituted for w or for y; hence for theoretical reasons we consider the two systems to be distinct. Of course this does not prohibit us from using the new system to obtain the solution of the original system.

Let z = a. Substituting into the third equation, we find y = 4 + 3a. Substituting into the second equation, we obtain w + 2(4 + 3a) + 5a = 2 or w = -6 - 11a. Substituting into the first equation,
    x - 2(-6 - 11a) - 3(4 + 3a) + 4a = 5  or  x = 5 - 17a
Thus the general solution of the original system is
    x = 5 - 17a, y = 4 + 3a, z = a, w = -6 - 11a
where a is any real number.

2.6. Determine the values of a so that the following system in unknowns x, y and z has: (i) no solution, (ii) more than one solution, (iii) a unique solution:
    x + y - z = 1
    2x + 3y + az = 3
    x + ay + 3z = 2

Reduce the system to echelon form. Eliminate x from the second and third equations by the operations L2 → -2L1 + L2 and L3 → -L1 + L3:
    -2L1:       -2x - 2y + 2z = -2
    L2:          2x + 3y + az = 3
    -2L1 + L2:       y + (a + 2)z = 1
and
    -L1:        -x - y + z = -1
    L3:          x + ay + 3z = 2
    -L1 + L3:       (a - 1)y + 4z = 1
Thus the equivalent system is
    x + y - z = 1
    y + (a + 2)z = 1
    (a - 1)y + 4z = 1
Now eliminate y from the third equation by the operation L3 → -(a - 1)L2 + L3:
    -(a - 1)L2:       -(a - 1)y + (2 - a - a^2)z = 1 - a
    L3:                (a - 1)y + 4z = 1
    -(a - 1)L2 + L3:       (6 - a - a^2)z = 2 - a  or  (3 + a)(2 - a)z = 2 - a
to obtain the equivalent system
    x + y - z = 1
    y + (a + 2)z = 1
    (3 + a)(2 - a)z = 2 - a
which has a unique solution if the coefficient of z in the third equation is not zero, that is, if a ≠ 2 and a ≠ -3.
In case a = 2, the third equation is 0 = 0 and the system has more than one solution. In case a = -3, the third equation is 0 = 5 and the system has no solution. Summarizing, we have: (i) a = -3, (ii) a = 2, (iii) a ≠ 2 and a ≠ -3.

2.7. Which condition must be placed on a, b and c so that the following system in unknowns x, y and z has a solution?
    x + 2y - 3z = a
    2x + 6y - 11z = b
    x - 2y + 7z = c

Reduce to echelon form. Eliminating x from the second and third equations by the operations L2 → -2L1 + L2 and L3 → -L1 + L3, we obtain the equivalent system
    x + 2y - 3z = a
    2y - 5z = b - 2a
    -4y + 10z = c - a
Eliminating y from the third equation by the operation L3 → 2L2 + L3, we finally obtain the equivalent system
    x + 2y - 3z = a
    2y - 5z = b - 2a
    0 = c + 2b - 5a

The system will have no solution if the third equation is of the form 0 = k, with k ≠ 0; that is, if c + 2b - 5a ≠ 0. Thus the system will have at least one solution if
    c + 2b - 5a = 0  or  5a = 2b + c
Note, in this case, that the system will have more than one solution. In other words, the system cannot have a unique solution.

HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS

2.8. Determine whether each system has a nonzero solution:
(i)
    x - 2y + 3z - 2w = 0
    3x - 7y - 2z + 4w = 0
    4x + 3y + 5z + 2w = 0
(ii)
    x + 2y - 3z = 0
    2x + 5y + 2z = 0
    3x - y - 4z = 0
(iii)
    x + 2y - z = 0
    2x + 5y + 2z = 0
    x + 4y + 7z = 0
    x + 3y + 3z = 0

(i) The system must have a nonzero solution since there are more unknowns than equations.

(ii) Reduce to echelon form:
    x + 2y - 3z = 0
    2x + 5y + 2z = 0
    3x - y - 4z = 0
to
    x + 2y - 3z = 0
    y + 8z = 0
    -7y + 5z = 0
to
    x + 2y - 3z = 0
    y + 8z = 0
    61z = 0
In echelon form there are exactly three equations in the three unknowns; hence the system has a unique solution, the zero solution.

(iii) Reduce to echelon form:
    x + 2y - z = 0
    2x + 5y + 2z = 0
    x + 4y + 7z = 0
    x + 3y + 3z = 0
to
    x + 2y - z = 0
    y + 4z = 0
    2y + 8z = 0
    y + 4z = 0
to
    x + 2y - z = 0
    y + 4z = 0
In echelon form there are only two equations in the three unknowns; hence the system has a nonzero solution.
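The pivot-counting criterion used in Problem 2.8 can be automated. A sketch using exact rational arithmetic (the function name is ours, not the book's): it reduces the coefficient rows of a homogeneous system to echelon form and counts the nonzero rows; fewer pivots than unknowns means a nonzero solution exists.

```python
from fractions import Fraction

def count_pivots(system):
    """Reduce a homogeneous system (a list of coefficient rows) to
    echelon form and return the number of pivot rows. A nonzero
    solution exists exactly when this count is less than the number
    of unknowns."""
    rows = [[Fraction(a) for a in row] for row in system]
    n = len(rows[0])          # number of unknowns
    pivots = 0
    for col in range(n):
        # find a row at or below the pivot line with a nonzero entry
        for r in range(pivots, len(rows)):
            if rows[r][col] != 0:
                rows[pivots], rows[r] = rows[r], rows[pivots]
                # clear this column in all lower rows
                for i in range(pivots + 1, len(rows)):
                    factor = rows[i][col] / rows[pivots][col]
                    rows[i] = [x - factor * p
                               for x, p in zip(rows[i], rows[pivots])]
                pivots += 1
                break
    return pivots

# Problem 2.8(ii) yields 3 pivots in 3 unknowns (only the zero
# solution); 2.8(iii) yields 2 pivots, hence a nonzero solution.
```

Using Fraction keeps the reduction exact, mirroring the hand computation rather than introducing floating-point error.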
2.9. The vectors u1, ..., um in, say, R^n are said to be linearly dependent, or simply dependent, if there exist scalars k1, ..., km, not all of them zero, such that k1u1 + ... + kmum = 0. Otherwise they are said to be independent. Determine whether the vectors u, v and w are dependent or independent, where:

(i) u = (1, 1, -1), v = (2, -3, 1), w = (8, -7, 1)
(ii) u = (1, -2, -3), v = (2, 3, -1), w = (3, 2, 1)
(iii) u = (a1, a2), v = (b1, b2), w = (c1, c2)

In each case: (a) let xu + yv + zw = 0 where x, y and z are unknown scalars; (b) find the equivalent homogeneous system of equations; (c) determine whether the system has a nonzero solution. If the system does, then the vectors are dependent; if the system does not, then they are independent.

(i) Let xu + yv + zw = 0:

x(1, 1, -1) + y(2, -3, 1) + z(8, -7, 1) = (0, 0, 0)
or   (x, x, -x) + (2y, -3y, y) + (8z, -7z, z) = (0, 0, 0)
or   (x + 2y + 8z, x - 3y - 7z, -x + y + z) = (0, 0, 0)

Set corresponding components equal to each other and reduce the system to echelon form:

 x + 2y + 8z = 0        x + 2y + 8z = 0        x + 2y + 8z = 0        x + 2y + 8z = 0
 x - 3y - 7z = 0  to      -5y - 15z = 0  to         y + 3z = 0  to         y + 3z = 0
-x +  y +  z = 0           3y +  9z = 0             y + 3z = 0

In echelon form there are only two equations in the three unknowns; hence the system has a nonzero solution. Accordingly, the vectors are dependent.

Remark: We need not solve the system to determine dependence or independence; we only need to know whether a nonzero solution exists.

(ii) x(1, -2, -3) + y(2, 3, -1) + z(3, 2, 1) = (0, 0, 0)
(x, -2x, -3x) + (2y, 3y, -y) + (3z, 2z, z) = (0, 0, 0)
(x + 2y + 3z, -2x + 3y + 2z, -3x - y + z) = (0, 0, 0)

  x + 2y + 3z = 0        x + 2y + 3z = 0        x + 2y + 3z = 0
-2x + 3y + 2z = 0  to        7y + 8z = 0  to        7y + 8z = 0
-3x -  y +  z = 0           5y + 10z = 0                30z = 0

In echelon form there are exactly three equations in the three unknowns; hence the system has only the zero solution. Accordingly, the vectors are independent.
(iii) x(a1, a2) + y(b1, b2) + z(c1, c2) = (0, 0)
(a1x, a2x) + (b1y, b2y) + (c1z, c2z) = (0, 0)
(a1x + b1y + c1z, a2x + b2y + c2z) = (0, 0)

and so

a1x + b1y + c1z = 0
a2x + b2y + c2z = 0

The system has a nonzero solution by Theorem 2.3, i.e. because there are more unknowns than equations; hence the vectors are dependent. In other words, we have proven that any three vectors in R^2 are dependent.

2.10. Suppose that in a homogeneous system of linear equations the coefficients of one of the unknowns are all zero. Show that the system has a nonzero solution.

Suppose x1, ..., xn are the unknowns of the system, and xj is the unknown whose coefficients are all zero. Then each equation of the system is of the form

a1x1 + ... + a_(j-1) x_(j-1) + 0xj + a_(j+1) x_(j+1) + ... + an xn = 0

Then, for example, (0, ..., 0, 1, 0, ..., 0), where 1 is the jth component, is a nonzero solution of each equation and hence of the system.

MISCELLANEOUS PROBLEMS

2.11. Prove Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system (*) and suppose W is the general solution of the associated homogeneous system (**). Then

u + W = {u + w : w in W}

is the general solution of the nonhomogeneous system (*).

Let U denote the general solution of the nonhomogeneous system (*), and write u = (u1, ..., un). Since u is a solution of (*), we have for i = 1, ..., m,

ai1 u1 + ai2 u2 + ... + ain un = bi

Now suppose w in W and w = (w1, ..., wn). Since w is a solution of the homogeneous system (**), we have for i = 1, ..., m,

ai1 w1 + ai2 w2 + ... + ain wn = 0

Therefore, for i = 1, ..., m,

ai1(u1 + w1) + ai2(u2 + w2) + ... + ain(un + wn)
   = (ai1 u1 + ai2 u2 + ... + ain un) + (ai1 w1 + ai2 w2 + ... + ain wn)
   = bi + 0 = bi

That is, u + w is a solution of (*). Thus u + w belongs to U, and hence u + W is contained in U.

Now suppose v = (v1, ..., vn) is an arbitrary element of U, i.e. a solution of (*). Then, for i = 1,
..., m,

ai1 v1 + ai2 v2 + ... + ain vn = bi

Observe that v = u + (v - u). We claim that v - u belongs to W. For i = 1, ..., m,

ai1(v1 - u1) + ai2(v2 - u2) + ... + ain(vn - un)
   = (ai1 v1 + ai2 v2 + ... + ain vn) - (ai1 u1 + ai2 u2 + ... + ain un)
   = bi - bi = 0

Thus v - u is a solution of the homogeneous system (**), i.e. v - u belongs to W. Then v belongs to u + W, and hence U is contained in u + W.

Both inclusion relations give us U = u + W; that is, u + W is the general solution of the nonhomogeneous system (*).

2.12. Consider the system (*) of linear equations (page 18). Multiplying the ith equation by ci, and adding, we obtain the equation

(c1 a11 + ... + cm am1)x1 + ... + (c1 a1n + ... + cm amn)xn = c1 b1 + ... + cm bm     (1)

Such an equation is termed a linear combination of the equations in (*). Show that any solution of (*) is also a solution of the linear combination (1).

Suppose u = (k1, ..., kn) is a solution of (*). Then

ai1 k1 + ai2 k2 + ... + ain kn = bi,   i = 1, ..., m     (2)

To show that u is a solution of (1), we must verify the equation

(c1 a11 + ... + cm am1)k1 + ... + (c1 a1n + ... + cm amn)kn = c1 b1 + ... + cm bm

But this can be rearranged into

c1(a11 k1 + ... + a1n kn) + ... + cm(am1 k1 + ... + amn kn) = c1 b1 + ... + cm bm

or, by (2),

c1 b1 + ... + cm bm = c1 b1 + ... + cm bm

which is clearly a true statement.

2.13. In the system (*) of linear equations, suppose a11 ≠ 0. Let (#) be the system obtained from (*) by the operation Li -> -ai1 L1 + a11 Li, i ≠ 1. Show that (*) and (#) are equivalent systems, i.e. have the same solution set.

In view of the above operation on (*), each equation in (#) is a linear combination of equations in (*); hence by the preceding problem any solution of (*) is also a solution of (#). On the other hand, applying the operation Li -> (ai1/a11)L1 + (1/a11)Li to (#), we obtain the original system (*). That is, each equation in (*) is a linear combination of equations in (#); hence each solution of (#) is also a solution of (*).
Both conditions show that (*) and (#) have the same solution set.

2.14. Prove Theorem 2.2: Consider a system in echelon form:

a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
          a_(2,j2) x_j2 + a_(2,j2+1) x_(j2+1) + ... + a2n xn = b2
          .......................................................
                    a_(r,jr) x_jr + a_(r,jr+1) x_(jr+1) + ... + arn xn = br

where 1 < j2 < ... < jr and where a11 ≠ 0, a_(2,j2) ≠ 0, ..., a_(r,jr) ≠ 0. The solution is as follows. There are two cases:

(i) r = n. Then the system has a unique solution.
(ii) r < n. Then we can arbitrarily assign values to the n - r free variables and obtain a solution of the system.

The proof is by induction on the number r of equations in the system. If r = 1, then we have the single linear equation

a1 x1 + a2 x2 + a3 x3 + ... + an xn = b,   where a1 ≠ 0

The free variables are x2, ..., xn. Let us arbitrarily assign values to the free variables; say, x2 = k2, x3 = k3, ..., xn = kn. Substituting into the equation and solving for x1,

x1 = (1/a1)(b - a2 k2 - a3 k3 - ... - an kn)

These values constitute a solution of the equation; for, on substituting, we obtain

a1 · (1/a1)(b - a2 k2 - ... - an kn) + a2 k2 + ... + an kn = b,   or   b = b

which is a true statement.

Furthermore, if r = n = 1, then we have ax = b, where a ≠ 0. Note that x = b/a is a solution since a(b/a) = b is true. Moreover, if x = k is a solution, i.e. ak = b, then k = b/a. Thus the equation has a unique solution, as claimed.

Now assume r > 1 and that the theorem is true for a system of r - 1 equations. We view the r - 1 equations

a_(2,j2) x_j2 + a_(2,j2+1) x_(j2+1) + ... + a2n xn = b2
.......................................................
          a_(r,jr) x_jr + ... + arn xn = br

as a system in the unknowns x_j2, ..., xn. Note that the system is in echelon form. By induction we can arbitrarily assign values to the (n - j2 + 1) - (r - 1) free variables in the reduced system to obtain a solution (say, x_j2 = k_j2, ..., xn = kn). As in the case r = 1, these values and arbitrary values for the additional j2 - 2 free variables (say, x2 = k2, ..., x_(j2-1)
= k_(j2-1)) yield a solution of the first equation with

x1 = (1/a11)(b1 - a12 k2 - ... - a1n kn)

(Note that there are (n - j2 + 1) - (r - 1) + (j2 - 2) = n - r free variables.) Furthermore, these values for x1, ..., xn also satisfy the other equations since, in these equations, the coefficients of x1, ..., x_(j2-1) are zero.

Now if r = n, then j2 = 2. Thus by induction we obtain a unique solution of the subsystem and then a unique solution of the entire system. Accordingly, the theorem is proven.

2.15. A system (*) of linear equations is defined to be consistent if no linear combination of its equations is the equation

0x1 + 0x2 + ... + 0xn = b,   where b ≠ 0     (1)

Show that the system (*) is consistent if and only if it is reducible to echelon form.

Suppose (*) is reducible to echelon form. Then it has a solution which, by Problem 2.12, is a solution of every linear combination of its equations. Since (1) has no solution, it cannot be a linear combination of the equations in (*). That is, (*) is consistent.

On the other hand, suppose (*) is not reducible to echelon form. Then, in the reduction process, it must yield an equation of the form (1). That is, (1) is a linear combination of the equations in (*). Accordingly, (*) is not consistent, i.e. (*) is inconsistent.

Supplementary Problems

SOLUTION OF LINEAR EQUATIONS

2.16. Solve:
(i) 2x + 3y = 1, 5x + 7y = 3
(ii) 2x + 4y = 10, 3x + 6y = 15
(iii) 4x - 2y = 5, -6x + 3y = 1

2.17. Solve:
(i) 2x + y - 3z = 5, 3x - 2y + 2z = 5, 5x - 3y - z = 16
(ii) 2x + 3y - 2z = 5, x - 2y + 3z = 2, 4x - y + 4z = 1
(iii) x + 2y + 3z = 3, 2x + 3y + 8z = 4, 3x + 2y + 17z = 1

2.18. Solve:
(i) 2x + 3y = 3, x - 2y = 5, 3x + 2y = 7
(ii) x + 2y - 3z + 2w = 2, 2x + 5y - 8z + 6w = 5, 3x + 4y - 5z + 2w = 4
(iii) x + 2y - z + 3w = 3, 2x + 4y + 4z + 3w = 9, 3x + 6y - z + 8w = 10

2.19. Solve:
(i) x + 2y + 2z = 2, 3x - 2y - z = 5, 2x - 5y + 3z = -4, x + 4y + 6z = 0
(ii) x + 5y + 4z - 13w = 3, 3x - y + 2z + 5w = 2, 2x + 2y + 3z - 4w = 1

2.20.
Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution, (ii) no solution, (iii) more than one solution:

(a) kx + y + z = 1        (b) x + 2y + kz = 1
    x + ky + z = 1            2x + ky + 8z = 3
    x + y + kz = 1

2.21. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution, (ii) no solution, (iii) more than one solution:

(a) x + y + kz = 2        (b) x - 3z = -3
    3x + 4y + 2z = k          2x + ky - z = -2
    2x + 3y - z = 1           x + 2y + kz = 1

2.22. Determine the condition on a, b and c so that the system in unknowns x, y and z has a solution:

(i) x + 2y - 3z = a        (ii) x - 2y + 4z = a
    3x - y + 2z = b             2x + 3y - z = b
    x - 5y + 8z = c             3x + y + 2z = c

HOMOGENEOUS SYSTEMS

2.23. Determine whether each system has a nonzero solution:

(i) x + 3y - 2z = 0        (ii) x + 3y - 2z = 0        (iii) x + 2y - 5z + 4w = 0
    x - 8y + 8z = 0             2x - 3y + z = 0              2x - 3y + 2z + 3w = 0
    3x - 2y + 4z = 0            3x - 2y + 2z = 0             4x - 7y + z - 6w = 0

2.24. Determine whether each system has a nonzero solution:

(i) x - 2y + 2z = 0        (ii) 2x - 4y + 7z + 4v - 5w = 0
    2x + y - 2z = 0             9x + 3y + 2z - 7v + w = 0
    3x + 4y - 6z = 0            5x + 2y - 3z + v + 3w = 0
    3x - 11y + 12z = 0          6x - 5y + 4z - 3v - 2w = 0

2.25. Determine whether the vectors u, v and w are dependent or independent (see Problem 2.9), where:

(i) u = (1, 3, -1), v = (2, 0, 1), w = (1, -1, 1)
(ii) u = (1, 1, -1), v = (2, 1, 0), w = (-1, 1, 2)
(iii) u = (1, -2, 3, 1), v = (3, 2, 1, -2), w = (1, 6, -5, -4)

MISCELLANEOUS PROBLEMS

2.26. Consider two general linear equations in the two unknowns x and y over the real field R:

ax + by = e
cx + dy = f

Show that:
(i) if a/c ≠ b/d, i.e. if ad - bc ≠ 0, then the system has the unique solution x = (de - bf)/(ad - bc), y = (af - ce)/(ad - bc);
(ii) if a/c = b/d ≠ e/f, then the system has no solution;
(iii) if a/c = b/d = e/f, then the system has more than one solution.

2.27. Consider the system

ax + by = 1
cx + dy = 0

Show that if ad - bc ≠ 0, then the system has the unique solution x = d/(ad - bc), y = -c/(ad - bc).
Also show that if ad - bc = 0 and c ≠ 0 or d ≠ 0, then the system has no solution.

2.28. Show that an equation of the form 0x1 + 0x2 + ... + 0xn = 0 may be added to or deleted from a system without affecting the solution set.

2.29. Consider a system of linear equations with the same number of equations as unknowns:

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
....................................
an1 x1 + an2 x2 + ... + ann xn = bn     (1)

(i) Suppose the associated homogeneous system has only the zero solution. Show that (1) has a unique solution for every choice of constants bi.
(ii) Suppose the associated homogeneous system has a nonzero solution. Show that there are constants bi for which (1) does not have a solution. Also show that if (1) has a solution, then it has more than one.

Answers to Supplementary Problems

2.16. (i) x = 2, y = -1; (ii) x = 5 - 2a, y = a; (iii) no solution

2.17. (i) (1, -3, -2); (ii) no solution; (iii) (-1 - 7a, 2 + 2a, a), or x = -1 - 7z, y = 2 + 2z

2.18. (i) x = 3, y = -1
(ii) (-a + 2b, 1 + 2a - 2b, a, b), or x = -z + 2w, y = 1 + 2z - 2w
(iii) (7/2 - 5b/2 - 2a, a, 1/2 + b/2, b), or x = 7/2 - 5w/2 - 2y, z = 1/2 + w/2

2.19. (i) (2, 1, -1); (ii) no solution

2.20. (a) (i) k ≠ 1 and k ≠ -2; (ii) k = -2; (iii) k = 1
(b) (i) never has a unique solution; (ii) k = 4; (iii) k ≠ 4

2.21. (a) (i) k ≠ 3; (ii) never (the system always has a solution); (iii) k = 3
(b) (i) k ≠ 2 and k ≠ -5; (ii) k = -5; (iii) k = 2

2.22. (i) 2a - b + c = 0. (ii) Any values of a, b and c yield a solution.

2.23. (i) yes; (ii) no; (iii) yes, by Theorem 2.3.

2.24. (i) yes; (ii) yes, by Theorem 2.3.

2.25. (i) dependent; (ii) independent; (iii) dependent

Chapter 3

Matrices

INTRODUCTION

In working with a system of linear equations, only the coefficients and their respective positions are important. Also, in reducing the system to echelon form, it is essential to keep the equations carefully aligned.
Thus these coefficients can be efficiently arranged in a rectangular array called a "matrix". Moreover, certain abstract objects introduced in later chapters, such as "change of basis", "linear operator" and "bilinear form", can also be represented by these rectangular arrays, i.e. matrices. In this chapter, we will study these matrices and certain algebraic operations defined on them. The material introduced here is mainly computational. However, as with linear equations, the abstract treatment presented later on will give us new insight into the structure of these matrices.

Unless otherwise stated, all the "entries" in our matrices shall come from some arbitrary, but fixed, field K. (See Appendix B.) The elements of K are called scalars. Nothing essential is lost if the reader assumes that K is the real field R or the complex field C.

Lastly, we remark that the elements of R^n or C^n are conveniently represented by "row vectors" or "column vectors", which are special cases of matrices.

MATRICES

Let K be an arbitrary field. A rectangular array of the form

[a11 a12 ... a1n]
[a21 a22 ... a2n]
[... ... ... ...]
[am1 am2 ... amn]

where the aij are scalars in K, is called a matrix over K, or simply a matrix if K is implicit. The above matrix is also denoted by (aij), i = 1, ..., m, j = 1, ..., n, or simply by (aij). The m horizontal n-tuples

(a11, a12, ..., a1n), (a21, a22, ..., a2n), ..., (am1, am2, ..., amn)

are the rows of the matrix, and the n vertical m-tuples

[a11]  [a12]       [a1n]
[a21], [a22], ..., [a2n]
[...]  [...]       [...]
[am1]  [am2]       [amn]

are its columns. Note that the element aij, called the ij-entry or ij-component, appears in the ith row and the jth column. A matrix with m rows and n columns is called an m by n matrix, or m x n matrix; the pair of numbers (m, n) is called its size or shape.

36 MATRICES [CHAP. 3

Example 3.1: The following is a 2 x 3 matrix:

[1 -3  4]
[0  5 -2]

Its rows are (1, -3, 4) and (0, 5, -2); its columns are

[1]  [-3]      [ 4]
[0], [ 5]  and [-2]

Matrices will usually be denoted by capital letters A, B, ...,
and the elements of the field K by lower case letters a, b, .... Two matrices A and B are equal, written A = B, if they have the same shape and if corresponding elements are equal. Thus the equality of two m x n matrices is equivalent to a system of mn equalities, one for each pair of elements.

Example 3.2: The statement

[x + y  2z + w]   [3 5]
[x - y   z - w] = [1 4]

is equivalent to the following system of equations:

x + y = 3,  x - y = 1,  2z + w = 5,  z - w = 4

The solution of the system is x = 2, y = 1, z = 3, w = -1.

Remark: A matrix with one row is also referred to as a row vector, and a matrix with one column as a column vector. In particular, an element in the field K can be viewed as a 1 x 1 matrix.

MATRIX ADDITION AND SCALAR MULTIPLICATION

Let A and B be two matrices with the same size, i.e. the same number of rows and of columns, say, m x n matrices:

A = [a11 a12 ... a1n]        B = [b11 b12 ... b1n]
    [a21 a22 ... a2n]            [b21 b22 ... b2n]
    [... ... ... ...]            [... ... ... ...]
    [am1 am2 ... amn]            [bm1 bm2 ... bmn]

The sum of A and B, written A + B, is the matrix obtained by adding corresponding entries:

A + B = [a11 + b11  a12 + b12  ...  a1n + b1n]
        [a21 + b21  a22 + b22  ...  a2n + b2n]
        [....................................]
        [am1 + bm1  am2 + bm2  ...  amn + bmn]

The product of a scalar k by the matrix A, written k · A or simply kA, is the matrix obtained by multiplying each entry of A by k:

kA = [ka11 ka12 ... ka1n]
     [ka21 ka22 ... ka2n]
     [.... .... ... ....]
     [kam1 kam2 ... kamn]

Observe that A + B and kA are also m x n matrices. We also define

-A = -1 · A   and   A - B = A + (-B)

The sum of matrices with different sizes is not defined.

Example 3.3: Let

A = [1 -2  3]   and   B = [ 3 0 2]
    [4  5 -6]             [-7 1 8]

Then

A + B = [1 + 3  -2 + 0   3 + 2]   [ 4 -2  5]
        [4 - 7   5 + 1  -6 + 8] = [-3  6  2]

3A = [3·1  3·(-2)  3·3   ]   [ 3  -6   9]
     [3·4  3·5     3·(-6)] = [12  15 -18]

2A - 3B = [2 -4   6] + [-9  0  -6]   [-7 -4   0]
          [8 10 -12]   [21 -3 -24] = [29  7 -36]

Example 3.4: The m x n matrix whose entries are all zero,

[0 0 ... 0]
[0 0 ... 0]
[0 0 ... 0]

is called the zero matrix and will be denoted by 0. It is similar to the scalar 0 in that, for any m x n matrix A = (aij), A + 0 = (aij + 0) = (aij) = A.
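The entrywise definitions above translate directly into code. The following is a minimal sketch with our own function names, using plain lists of lists; Example 3.3's matrices are used to exercise it.

```python
def mat_add(A, B):
    """Entrywise sum; defined only when A and B have the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices have different shapes")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scal_mul(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]

A = [[1, -2, 3], [4, 5, -6]]
B = [[3, 0, 2], [-7, 1, 8]]
print(mat_add(A, B))    # [[4, -2, 5], [-3, 6, 2]]
print(scal_mul(3, A))   # [[3, -6, 9], [12, 15, -18]]
```

Note that A - B is then just `mat_add(A, scal_mul(-1, B))`, mirroring the definition A - B = A + (-B).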
Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.

Theorem 3.1: Let V be the set of all m x n matrices over a field K. Then for any matrices A, B, C in V and any scalars k1, k2 in K,

(i) (A + B) + C = A + (B + C)        (v) k1(A + B) = k1A + k1B
(ii) A + 0 = A                       (vi) (k1 + k2)A = k1A + k2A
(iii) A + (-A) = 0                   (vii) (k1k2)A = k1(k2A)
(iv) A + B = B + A                   (viii) 1·A = A and 0A = 0

Using (vi) and (viii) above, we also have that A + A = 2A, A + A + A = 3A, ....

Remark: Suppose vectors in R^n are represented by row vectors (or by column vectors); say,

u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)

Then, viewed as matrices, the sum u + v and the scalar product ku are as follows:

u + v = (a1 + b1, a2 + b2, ..., an + bn)   and   ku = (ka1, ka2, ..., kan)

But this corresponds precisely to the sum and scalar product as defined in Chapter 1. In other words, the above operations on matrices may be viewed as a generalization of the corresponding operations defined in Chapter 1.

MATRIX MULTIPLICATION

The product of matrices A and B, written AB, is somewhat complicated. For this reason, we include the following introductory remarks.

(i) Let A = (ai) and B = (bi) belong to R^n, with A represented by a row vector and B by a column vector. Then their dot product A·B may be found by combining the matrices as follows:

A·B = (a1, a2, ..., an) [b1]
                        [b2]
                        [..] = a1b1 + a2b2 + ... + anbn
                        [bn]

Accordingly, we define the matrix product of a row vector A by a column vector B as above.

(ii) Consider the equations

b11 x1 + b12 x2 + b13 x3 = y1
b21 x1 + b22 x2 + b23 x3 = y2     (1)

This system is equivalent to the matrix equation

[b11 b12 b13] [x1]   [y1]
[b21 b22 b23] [x2] = [y2],   or simply BX = Y
              [x3]

where B = (bij), X = (xi) and Y = (yi), if we combine the matrix B and the column vector X as follows:

BX = [b11 x1 + b12 x2 + b13 x3]   [B1·X]
     [b21 x1 + b22 x2 + b23 x3] = [B2·X]

where B1 and B2 are the rows of B.
Note that the product of a matrix and a column vector yields another column vector.

(iii) Now consider the equations

a11 y1 + a12 y2 = z1
a21 y1 + a22 y2 = z2     (2)

which we can represent, as above, by the matrix equation

[a11 a12] [y1]   [z1]
[a21 a22] [y2] = [z2],   or simply AY = Z

where A = (aij), Y = (yi) as above, and Z = (zi). Substituting the values of y1 and y2 of (1) into the equations of (2), we obtain

a11(b11 x1 + b12 x2 + b13 x3) + a12(b21 x1 + b22 x2 + b23 x3) = z1
a21(b11 x1 + b12 x2 + b13 x3) + a22(b21 x1 + b22 x2 + b23 x3) = z2

or, on rearranging terms,

(a11 b11 + a12 b21)x1 + (a11 b12 + a12 b22)x2 + (a11 b13 + a12 b23)x3 = z1
(a21 b11 + a22 b21)x1 + (a21 b12 + a22 b22)x2 + (a21 b13 + a22 b23)x3 = z2     (3)

On the other hand, using the matrix equation BX = Y and substituting for Y into AY = Z, we obtain the expression

ABX = Z

This will represent the system (3) if we define the product of A and B as follows:

[a11 a12] [b11 b12 b13]   [a11 b11 + a12 b21   a11 b12 + a12 b22   a11 b13 + a12 b23]
[a21 a22] [b21 b22 b23] = [a21 b11 + a22 b21   a21 b12 + a22 b22   a21 b13 + a22 b23]

                        = [A1·B^1  A1·B^2  A1·B^3]
                          [A2·B^1  A2·B^2  A2·B^3]

where A1 and A2 are the rows of A and B^1, B^2 and B^3 are the columns of B. We emphasize that if these computations are done in general, then the main requirement is that the number of yi in (1) and (2) must be the same. This will then correspond to the fact that the number of columns of the matrix A must equal the number of rows of the matrix B.
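The row-by-column rule just derived amounts to a triple loop: for each row of A and each column of B, take the dot product. A minimal sketch (not part of the text; the function name is our own):

```python
def mat_mul(A, B):
    """AB is defined when the number of columns of A equals the number of
    rows of B; its ij-entry is the ith row of A dotted with the jth
    column of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 2]]
print(mat_mul(A, B))   # [[1, 5], [3, 11]]
print(mat_mul(B, A))   # [[4, 6], [6, 8]] -- AB and BA differ in general
```

Comparing the two outputs already shows that matrix multiplication is not commutative.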
Opn P 1 fc = l Am'B"! /Cii ... Cm Cii \Cml ... Ci = 2 Cifc&fci- We emphasize that the product AB is not defined if A is an m x p matrix and B is a qxn matrix, where p ^ q. Example 3.5: r s t u <»i (H "3 6i 62 ^3 raj + s6i ra2 + 562 ^'is + S63 toi + m6i (02 + M^2 *"3 + '^^s Example 3.6: 1 2 3 4 1 1 2 1 1 2 1 2 3 4 1-1 + 2-0 I'l + 2-2 3-1 + 4-0 3«l + 4'2 1-1 + 1-3 1-2 + 1'4 0'l + 2-3 0*2 + 2*4 1 5 3 11 4 6 6 8 The above example shows that matrix multiplication is not commutative, i.e. the products AB and BA of matrices need not be equal. Matrix multiplication does, however, satisfy the following properties: Theorem 3.2: (i) iAB)C = A{BC), (associative law) (ii) A{B + C) = AB + AC, (left distributive law) (iii) (B + C)A = BA + CA, (right distributive law) (iv) k{AB) = {kA)B = A{kB), where A; is a scalar We assume that the sums and products in the above theorem are defined. We remark that OA = and BO =^ where is the zero matrix. TRANSPOSE The transpose of a matrix A, written A*, is the matrix obtained by writing the rows of A, in order, as columns: /ttii 0.12 . . . Oln \ ' /ttli 0.21 . . . aTOl\ 0,21 ffi22 . . . 02n \ / O12 ffl22 . . . Om2 ^Oml Om2 . . . Omni \Oin 02n . ■ . OmnJ Observe that if A is an m x « matrix, then A' is an w x m matrix. 40 MATRICES [CHAP. 3 /l 4^ Example 3.7: (J J _IJ = (2 -5^ The transpose operation on matrices satisfies the following properties: Theorem 3.3: (i) (A+B)* = A* + B* (ii) (A')' = A (iii) {kAy — kA\ for k a scalar (iv) {ABy = B«A« MATRICES AND SYSTEMS OF LINEAR EQUATIONS The following system of linear equations anXi + ai2X2 + • • • + aina;n = &i a2iXi + a22X2 + • • • + annXn =62 n\ OmlXi + am2X2 + • • • + OmnXn is equivalent to the matrix equation /an ai2 a2i 022 ... a2n \IX2\ ^ lb2 \ or simply AX = B (2) lOml fflm2 where A = (an), X = {Xi) and B = (&i). That is, every solution of the system {1) is a solution of the matrix equation (2), and vice versa. 
Observe that the associated homogeneous system of (1) is then equivalent to the matrix equation AX = 0. The above matrix A is called the coefficient matrix of the system (1), and the matrix /(111 O12 . . • ttin tt21 tt22 • • • (^2n ^ttml (lm2 • . . Otnn is called the augmented matrix of (1). Observe that the system (1) is completely determined by its augmented matrix. Example 3.8: The coefficient matrix and the augmented matrix of the system 2a; + 3j/ - 4z = 7 x-2y- 5z - 3 are respectively the following matrices: /2 3 -4\ /2 3-4 7 (1 _2 -5; ^*^ \l -2 -5 3 Observe that the system is equivalent to the matrix equation '2 3 1 -2 X\ ,rj In studying linear equations it is usually simpler to use the language and theory of matrices, as indicated by the following theorems. CHAP. 3] MATRICES 41 Theorem 3.4: Suppose Ui,U2, . . .,tin are solutions of a homogeneous system of linear equations AX = 0. Then every linear combination of the m of the form kiUi + kiUz + • • • + krOin where the fe are scalars, is also a solution of AX = 0. Thus, in particular, every multiple ku of any solution u of AX = is also a solution of AX = 0. Proof. We are given that Aui — 0, Au2 = 0, . . . , Aun = 0. Hence A{kui + kui + • • • + fettn) = kiAui + kiAu^ + • • • + knAun = fciO + ^20 + • • • + fc„0 = Accordingly, kiUi + • • • + k„iia is a solution of the homogeneous system AX = 0. Theorem 3.5: Suppose the field K is infinite (e.g. if K is the real field R or the complex field C). Then the system AX = B has no solution, a unique solution or an infinite number of solutions. Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many. Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av — B. Then, for any k GK, A{u + k{u-v)) = Au + k{Au-Av) = B + k(B-B) = B In other words, for each k e K, u + k(u-v) is a. solution of AX = B. Since all such solu- tions are distinct (Problem 3,31), AX = B has an infinite number of solutions as claimed. 
ECHELON MATRICES

A matrix A = (aij) is an echelon matrix, or is said to be in echelon form, if the number of zeros preceding the first nonzero entry of a row increases row by row until only zero rows remain; that is, if there exist nonzero entries

a_(1,j1), a_(2,j2), ..., a_(r,jr),   where j1 < j2 < ... < jr

with the property that aij = 0 for i <= r, j < ji, and for i > r. We call a_(1,j1), ..., a_(r,jr) the distinguished elements of the echelon matrix A.

Example 3.9: The following are echelon matrices, with the distinguished elements marked here by parentheses:

[(2) 3  2  0  4  5 -6]     [(1) 2  3 ]     [0 (1) 3  0  0  4]
[ 0  0 (1) 1 -3  2  0]     [ 0  0 (1)]     [0  0  0 (1) 0 -3]
[ 0  0  0  0  0 (6) 2]     [ 0  0  0 ]     [0  0  0  0 (1) 2]
[ 0  0  0  0  0  0  0]

In particular, an echelon matrix is called a row reduced echelon matrix if the distinguished elements are: (i) the only nonzero entries in their respective columns; (ii) each equal to 1.

The third matrix above is an example of a row reduced echelon matrix; the other two are not. Note that the zero matrix 0, for any number of rows or of columns, is also a row reduced echelon matrix.

ROW EQUIVALENCE AND ELEMENTARY ROW OPERATIONS

A matrix A is said to be row equivalent to a matrix B if B can be obtained from A by a finite sequence of the following operations, called elementary row operations:
Suppose the ji column is the first column with a nonzero entry. Inter- change the rows so that this nonzero entry appears in the first row, that is, so that ttijj ¥- 0. Step 2. For each i > 1, apply the operation Ri -* —ttij^Ri + aijjiJt Repeat Steps 1 and 2 with the submatrix formed by all the rows excluding the first. Continue the process until the matrix is in echelon form. Remark: The term row reduce shall mean to transform by elementary row operations. Example 3.10: The following matrix A is row reduced to echelon form by applying the operations R2 ^ -2Ri + ^2 and i?3 ^ -3fii + R3, and then the operation R3 -» -SKj + 4^3: 1 2 -3 4 2 5 3 1 2 -3 4 2 2 A=2 4-22to0042to Now suppose A = (oij) is a matrix in echelon form with distinguished elements aijj, . . . , Orj^. Apply the operations Rk -^ -ak^Ri + Oii-Rk, fc = 1, . . ., i- 1 for i = 2, then i = 3, ...,i = r. Thus A is replaced by an echelon matrix whose dis- tinguished elements are the only nonzero entries in their respective columns. Next, multiply Ri by a~^, i~r. Thus, in addition, the distinguished elements are each 1. In other words, the above process row reduces an echelon matrix to one in row reduced echelon form. Example 3.11: On the following echelon matrix A, apply the operation R^ -^ — 4^2 + 3i2i and then the operations fii -♦ ^3 + Bi and R^ -> — 5K3 + 2i22: /2 3 4 5 6\ /6 9 7 -2\ /6 9 7 0^ A=0 3 2 5toO 3 2 5to0 6 4 \0 2/ \0 2/ \0 2/ Next multiply Ri by 1/6, R2 by 1/6 and ^3 by 1/2 to obtain the row reduced echelon matrix /l 3/2 7/6 0\ 12/3 \0 1/ The above remarks show that any arbitrary matrix A is row equivalent to at least one row reduced echelon matrix. In the next chapter we prove, Theorem 4.8, that A is row equivalent to only one such matrix; we call it the row canonical form of A. CHAP. 3] MATRICES 43 SQUARE MATRICES A matrix with the same number of rows as columns is called a square matrix. A square matrix with n rows and n columns is said to be of order n, and is called an n-square matrix. 
The diagonal (or main diagonal) of the n-square matrix A = (aij) consists of the elements a11, a22, ..., ann.

Example 3.12: The following is a 3-square matrix:

[1 2 3]
[4 5 6]
[7 8 9]

Its diagonal elements are 1, 5 and 9.

An upper triangular matrix, or simply a triangular matrix, is a square matrix whose entries below the main diagonal are all zero:

[a11 a12 ... a1n]
[ 0  a22 ... a2n]
[ .. ... ... ...]
[ 0   0  ... ann]

Similarly, a lower triangular matrix is a square matrix whose entries above the main diagonal are all zero.

A diagonal matrix is a square matrix whose non-diagonal entries are all zero:

[a1  0 ...  0]
[ 0 a2 ...  0]
[ . .. ... ..]
[ 0  0 ... an]

In particular, the n-square matrix with 1's on the diagonal and 0's elsewhere, denoted by In or simply I, is called the unit or identity matrix; e.g.,

I3 = [1 0 0]
     [0 1 0]
     [0 0 1]

This matrix I is similar to the scalar 1 in that, for any n-square matrix A,

AI = IA = A

The matrix kI, for a scalar k in K, is called a scalar matrix; it is a diagonal matrix whose diagonal entries are each k.

ALGEBRA OF SQUARE MATRICES

Recall that not every two matrices can be added or multiplied. However, if we only consider square matrices of some given order n, then this inconvenience disappears. Specifically, the operations of addition, multiplication, scalar multiplication, and transpose can be performed on any n x n matrices and the result is again an n x n matrix.

In particular, if A is any n-square matrix, we can form powers of A:

A^2 = AA,  A^3 = A^2 A,  ...   and   A^0 = I

We can also form polynomials in the matrix A: for any polynomial

f(x) = a0 + a1 x + a2 x^2 + ... + an x^n
Example 3.13: Let A = (J _l); then A^ = (J J)(J _^2) = (_^ "^ If f(x) = 2a;2 - 3a; + 5, then If g{x) = x^ + 3x- 10, then ''^' = (J ^:) - Ka -I) - < :) ^ c : Thus A is a zero of the polynomial g(x). INVERTIBLE MATRICES A square matrix A is said to be invertible if there exists a matrix B with the property that AB = BA = I where / is the identity matrix. Such a matrix B is unique; for ABi - BiA = / and AB2 = B2A = I implies Bi = BJ - BiiABz) = iBiA)Bi = IB2 = B2 We call such a matrix B the inverse of A and denote it by A~*. Observe that the above relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B. Example 3.14: 2 5\/ 3 -5\ _ /6-5 -10 + 10 1 3/1^-1 2) ~ 1^3-3 -5 + 6 3 -5\/2 5\ _ / 6-5 15-15 -1 2j\l s) " V-2 + 2 -5 + 6 ,'2 5\ / 3 -5\ Thus ( , „ ) and | ^ „ ) are invertible and are inverses of each other. 1 1 1 1 1 3/"""\^-i 2 We show (Problem 3.37) that for square matrices, AB = I if and only if BA = /; hence it is necessary to test only one product to determine whether two given matrices are in- verses, as in the next example. -11 + + 12 2 + 0-2 2 + 0-2 Example 3.15: (2-1 3)(-4 l|=|-22 + 4 + 18 4 + 0-3 4-1-3 -44-4 + 48 8 + 0-8 8 + 1-8 Thus the two matrices are invertible and are inverses of each other. fa b\ We now calculate the inverse of a general 2x2 matrix A — { 1 . We seek scalars X, y, z, w such that ^ '' a ^\( ^ y\ _ /l 0\ fax + bz ay + bw\ _ /l cd)\zwj~\0l) ^^ \cx + dz cy + dwj " \0 1 CHAP. 3] MATRICES 45 which reduces to solving the following two systems of linear equations in two unknowns: iax + bz = 1 jay + bw = \cx + d2 = \cy + dw = 1 If we let |A| = ad — be, then by Problem 2.27, page 33, the above systems have solutions if and only if \A\ ¥= 0; such solutions are unique and are as follows: d d _ —b _ ^ _ — c — z£. — "' ^ ad -be \A\' " ad -be \A\' ad - be \A\' ad -be \A\ d/\A\ -b/\A\\ _ \ ( d -b" Accordingly, .. 
    A^-1 = [ d/|A|  -b/|A|]  =  (1/|A|) [ d  -b]
           [-c/|A|   a/|A|]             [-c   a]

Remark: The reader no doubt recognizes |A| = ad - bc as the determinant of the matrix A; thus we see that a 2 x 2 matrix has an inverse if and only if its determinant is not zero. This relationship, which holds true in general, will be further investigated in Chapter 9 on determinants.

BLOCK MATRICES

Using a system of horizontal and vertical lines, we can partition a matrix A into smaller matrices called blocks (or: cells) of A. The matrix A is then called a block matrix. Clearly, a given matrix may be divided into blocks in different ways; for example,

    [1 -2 | 0  1 | 3 ]     [1 -2  0 | 1  3]
    [2  3 | 5  7 | -2]  =  [2  3  5 | 7 -2]
    [-----+------+---]     [--------+-----]
    [3  1 | 4  5 | 9 ]     [3  1  4 | 5  9]

The convenience of the partition into blocks is that the result of operations on block matrices can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. This is illustrated below.

Suppose A is partitioned into blocks; say

    A = [A11 A12 ... A1n]
        [A21 A22 ... A2n]
        [ .   .  ...  . ]
        [Am1 Am2 ... Amn]

Multiplying each block by a scalar k multiplies each element of A by k; thus

    kA = [kA11 kA12 ... kA1n]
         [kA21 kA22 ... kA2n]
         [  .    .  ...   . ]
         [kAm1 kAm2 ... kAmn]

Now suppose a matrix B is partitioned into the same number of blocks as A; say

    B = [B11 B12 ... B1n]
        [B21 B22 ... B2n]
        [ .   .  ...  . ]
        [Bm1 Bm2 ... Bmn]

Furthermore, suppose the corresponding blocks of A and B have the same size. Adding these corresponding blocks adds the corresponding elements of A and B. Accordingly,

    A + B = [A11+B11  A12+B12 ... A1n+B1n]
            [A21+B21  A22+B22 ... A2n+B2n]
            [   .        .    ...    .   ]
            [Am1+Bm1  Am2+Bm2 ... Amn+Bmn]

The case of matrix multiplication is less obvious but still true. That is, suppose matrices U and V are partitioned into blocks as follows:

    U = [U11 ... U1p]            V = [V11 ... V1n]
        [ .  ...  . ]    and         [ .  ...  . ]
        [Um1 ... Ump]                [Vp1 ... Vpn]

such that the number of columns of each block Uik is equal to the number of rows of each block Vkj. Then

    UV = [W11 W12 ... W1n]
         [W21 W22 ... W2n]
         [ .   .  ...  . ]
         [Wm1 Wm2 ... Wmn]
where Wij = Ui1 V1j + Ui2 V2j + ... + Uip Vpj.

The proof of the above formula for UV is straightforward, but detailed and lengthy. It is left as a supplementary problem (Problem 3.68).

Solved Problems

MATRIX ADDITION AND SCALAR MULTIPLICATION

3.1. Compute:

    (i)  [1  2 -3  4] + [3 -5  6 -1]        (ii) the sum of two matrices        (iii) -3 [1  2 -3]
         [0 -5  1 -1]   [2  1 -2 -3]             with different shapes                   [4 -5  6]

(i) Add corresponding entries:

    [1+3   2-5  -3+6   4-1]  =  [4 -3  3  3]
    [0+2  -5+1   1-2  -1-3]     [2 -4 -1 -4]

(ii) The sum is not defined since the matrices have different shapes.

(iii) Multiply each entry in the matrix by the scalar -3:

    -3 [1  2 -3]  =  [ -3  -6   9]
       [4 -5  6]     [-12  15 -18]

3.2. Let A = [2 -5 1; 3 0 -4], B = [1 -2 -3; 0 -1 5] and C = [0 1 -2; 1 -1 -1]. Find 3A + 4B - 2C.

First perform the scalar multiplications, and then the matrix addition:

    3A + 4B - 2C = [6 -15   3] + [4 -8 -12] + [ 0 -2  4]  =  [10 -25 -5]
                   [9   0 -12]   [0 -4  20]   [-2  2  2]     [ 7  -2 10]

3.3. Find x, y, z and w if

    3 [x y]  =  [ x   6] + [ 4   x+y]
      [z w]     [-1  2w]   [z+w    3]

First write each side as a single matrix:

    [3x 3y]  =  [ x+4   x+y+6]
    [3z 3w]     [z+w-1   2w+3]

Set corresponding entries equal to each other to obtain the system of four equations,

    3x = x + 4              2x = 4
    3y = x + y + 6    or    2y = 6 + x
    3z = z + w - 1          2z = w - 1
    3w = 2w + 3             w  = 3

The solution is: x = 2, y = 4, z = 1, w = 3.

3.4. Prove Theorem 3.1(v): Let A and B be m x n matrices and k a scalar. Then k(A + B) = kA + kB.

Suppose A = (aij) and B = (bij). Then aij + bij is the ij-entry of A + B, and so k(aij + bij) is the ij-entry of k(A + B). On the other hand, kaij and kbij are the ij-entries of kA and kB respectively, and so kaij + kbij is the ij-entry of kA + kB. But k, aij and bij are scalars in a field; hence

    k(aij + bij) = kaij + kbij,  for every i, j

Thus k(A + B) = kA + kB, as corresponding entries are equal.

Remark: Observe the similarity of this proof to the proof of Theorem 1.1(v) in Problem 1.6, page 7.
In fact, all other parts of the above theorem are proven in the same way as the corresponding parts of Theorem 1.1.

MATRIX MULTIPLICATION

3.5. Let (r x s) denote a matrix with shape r x s. Find the shape of the following products if the product is defined:

    (i) (2 x 3)(3 x 4)     (iii) (1 x 2)(3 x 1)     (v) (3 x 4)(3 x 4)
    (ii) (4 x 1)(1 x 2)    (iv) (5 x 2)(2 x 3)      (vi) (2 x 2)(2 x 4)

Recall that an m x p matrix and a q x n matrix are multipliable only when p = q, and then the product is an m x n matrix. Thus each of the above products is defined if the "inner" numbers are equal, and then the product will have the shape of the "outer" numbers in the given order.

(i) The product is a 2 x 4 matrix.
(ii) The product is a 4 x 2 matrix.
(iii) The product is not defined since the inner numbers 2 and 3 are not equal.
(iv) The product is a 5 x 3 matrix.
(v) The product is not defined even though the matrices have the same shape.
(vi) The product is a 2 x 4 matrix.

3.6. Let A = [1 3; 2 -1] and B = [2 0 -4; 3 -2 6]. Find (i) AB, (ii) BA.

(i) Since A is 2 x 2 and B is 2 x 3, the product AB is defined and is a 2 x 3 matrix. To obtain the entries in the first row of AB, multiply the first row (1, 3) of A by the columns (2, 3), (0, -2) and (-4, 6) of B, respectively:

    [1  3] [2  0 -4]  =  [1·2+3·3  1·0+3·(-2)  1·(-4)+3·6]  =  [2+9  0-6  -4+18]  =  [11 -6 14]

To obtain the entries in the second row of AB, multiply the second row (2, -1) of A by the columns of B, respectively:

    [1  3] [2  0 -4]  =  [    11         -6           14     ]
    [2 -1] [3 -2  6]     [2·2+(-1)·3  2·0+(-1)·(-2)  2·(-4)+(-1)·6]

    Thus  AB = [11 -6  14]
               [ 1  2 -14]

(ii) Note that B is 2 x 3 and A is 2 x 2. Since the inner numbers 3 and 2 are not equal, the product BA is not defined.

3.7. Given A = (2, 1) and B = [1 -2 0; 4 5 -3], find (i) AB, (ii) BA.

(i) Since A is 1 x 2 and B is 2 x 3, the product AB is defined and is a 1 x 3 matrix, i.e. a row vector with 3 components.
To obtain the components of AB, multiply the row of A by each column of B: AB = (%,!)( \ ■"! ® ) = (2 • 1 + 1 . 4, 2 • (-2) + 1 • 5, 2 • + 1 • (-3)) = (6, 1, -3) \ 4 5 — 8 / (ii) Note that B is 2 X 3 and A is 1 X 2. Since the inner numbers 3 and 1 are not equal, the product BA is not defined. 3.8. Given A = ( 1 and B = [^ ^ ^V find (i) AB, (ii) BA. (i) Since A is 3 X 2 and B is 2 X 3, the product AB is defined and is a 3 X 3 matrix. To obtain the first row of AB, multiply the first row of A by each column of B, respectively: 2 -1 \ , /2-3 -4-4 -10 + 0\ 1-1 -8 -10^ \ " ( 3 1 "o ) = -3 To obtain the second row of AB, multiply the second row of A by each column of B, respectively: / 2 -1 \ , / -1 -8 -10 \ /-I -8 -10^ 1-2-5 3 4 \ -6 i / To obtain the third row of AB, multiply the third row of A by each column of B, respectively: 1 1( ^ ^ ^ ] = I 1 + -2 + -5 + I = 11 -2 -5 CHAP. 3] MATRICES 49 "(s -:-:) = -1 -8 -10 1 -2 -5 -3 + 12 6 + 16 15 + i 1 -8 -10 1 -2 -5 9 22 15 Thus AB (ii) Since 5 is 2 X 3 and i4 is 3 X 2, the product BA is defined and is a 2 X 2 matrix. To obtain the first row of BA, multiply the first row of B by each column of A, respectively: 2-2 + 15 -1 + 0-20 15 -21 To obtain the second row of BA, multiply the second row of B by each column of A, respectively: )l ' "« '15 -21 > .10 -2 -5 4 15 -21 6 + 4 + -3 + + /15 -21 (lO -3 Thus BA "-1) Remark: Observe that in this case both AB and BA are defined, but they are not equal; in fact they do not even have the same shape. 3.9. Let A = /I -4 1 2 -1 o\ and B = 2 -1 3 -1 1 \4 -2 0, (i) Determine the ahaite of AB. (ii) Let Ca denote the element in the ith row and :;th column of the product matrix AJB, that is, AB = (co). Find: c^a, Cu and C21. (i) Since A is 2 X 3 and £ is 3 X 4, the product AB is a 2 X 4 matrix. (ii) Now Cy is defined as the product of the ith row of A by the ith column of B. 
Hence: = 1 • + • 3 + (-3) • (-2) = + + 6 = 6 c,4 - (2,-1,0) 2-1 + (-l)'(-l) + 0-0 = 2 + 1 + = C21 = (1, 0, -3) = 1 • 1 + • 2 + (-3) -4 = 1 + 0-12 = -11 3.10. Compute: (i) (ii) 1 6\/4 -3 5/(2 -1 1 6 -3 5 2 -7 (iii) (iv) -^(-l (3,2) (V) (2,-1) -6 (i) The first factor is 2 X 2 and the second is 2 X 2, so the product is defined and is a 2 X 2 matrix: 50 MATRICES [CHAP. 3 1 6Y4 0\ ^ / l'4 + 6-2 l-0 + 6'(-l) \ _ / 16 -6' ^-3 5A2 -1/ V(-3)-4 + 5.2 (-3)-0 + 5-(-l)y ~ [-2 -5^ (ii) The first factor is 2 X 2 and the second is 2 X 1, so the product is defined and is a 2 X 1 matrix: 1 ^V 2\ _ / 1-2 + 6- (-7) \ _ /-40' -3 5A-7; \(-3)'2 + 5'(-7)) ^ [-41^ (iii) Now the first factor is 2 X 1 and the second is 2X2. Since the inner numbers 1 and 2 are distinct, the product is not defined. (iv) Here the first factor is 2 X 1 and the second is 1 X 2, so the product is defined and is a 2 X 2 matrix: 3 2^ {>'' = il--l i:i) = 18 12 (v) The first factor is 1 X 2 and the second is 2 X 1, so the product is defined and is a 1 X 1 matrix which we frequently write as a scalar. (2,-l)(^_g) = (2-1 + (-1). (-6)) = (8) = 8 3.11. Prove Theorem 3.2(i): (AB)C = A{BC). Let A = (oy), B = (bfl,) and C = (e^). Furthermore, let AB = S = (sj^) and BC = T = (t,,). Then Sjfc = ajiftifc + at2b2k + • • • + at„6mfc = 2 Oyftjj. 3=1 n hi = ^ji'^ii + bjiCn + • • • + bj„c„i = 2 fcjfcCfci lc=l Now multiplying S by C, i.e. (AB) by C, the element in the ith row and Ith column of the matrix {AB)C is n m »ilCll + SJ2C21 + • • • + Si„C„i = 2 StfcCfcl =22 {"'ifijkiOkl k=l fc=l j=l On the other hand, multiplying A by T, i.e. A by BC, the element in the tth row and fth column of the matrix A{BC) is tn m n »il*ll + «i2*2! + • • • + aim*ml = 2 »ij*jl =22 ««(6jfcCfci) Since the above sums are equal, the theorem is proven. 3.12. Prove Theorem 3.2(ii): A{B + C) = AB + AC. Let A = (tty), B = (6jfc) and C = (Cj^). 
Furthermore, let D = B + C = (djk), E = AB = (eik) and F = AC = (fik). Then

    djk = bjk + cjk

    eik = ai1 b1k + ai2 b2k + ... + aim bmk = Σj aij bjk

    fik = ai1 c1k + ai2 c2k + ... + aim cmk = Σj aij cjk

Hence the element in the ith row and kth column of the matrix AB + AC is

    eik + fik = Σj aij bjk + Σj aij cjk = Σj aij (bjk + cjk)

On the other hand, the element in the ith row and kth column of the matrix AD = A(B + C) is

    ai1 d1k + ai2 d2k + ... + aim dmk = Σj aij djk = Σj aij (bjk + cjk)

Thus A(B + C) = AB + AC, since the corresponding elements are equal.

TRANSPOSE

3.13. Find the transpose A^t of a given matrix A.

Rewrite the rows of A as the columns of A^t.

3.14. Let A be an arbitrary matrix. Under what conditions is the product AA^t defined?

Suppose A is an m x n matrix; then A^t is n x m. Thus the product AA^t is always defined. Observe that A^tA is also defined. Here AA^t is an m x m matrix, whereas A^tA is an n x n matrix.

3.15. Let A = [1 2 0; 3 -1 4]. Find (i) AA^t, (ii) A^tA.

To obtain A^t, rewrite the rows of A as columns:

    A^t = [1  3]
          [2 -1]
          [0  4]

Then

    AA^t = [1·1+2·2+0·0     1·3+2·(-1)+0·4  ]  =  [5  1]
           [3·1+(-1)·2+4·0  3·3+(-1)·(-1)+4·4]     [1 26]

    A^tA = [1·1+3·3     1·2+3·(-1)     1·0+3·4   ]  =  [10 -1 12]
           [2·1+(-1)·3  2·2+(-1)·(-1)  2·0+(-1)·4]     [-1  5 -4]
           [0·1+4·3     0·2+4·(-1)     0·0+4·4   ]     [12 -4 16]

3.16. Prove Theorem 3.3(iv): (AB)^t = B^t A^t.

Let A = (aik) and B = (bkj). Then the element in the ith row and jth column of the matrix AB is

    ai1 b1j + ai2 b2j + ... + aim bmj     (1)

Thus (1) is the element which appears in the jth row and ith column of the transpose matrix (AB)^t. On the other hand, the jth row of B^t consists of the elements from the jth column of B:

    (b1j  b2j  ...  bmj)     (2)

Furthermore, the ith column of A^t consists of the elements from the ith row of A:

    (ai1  ai2  ...  aim)^t     (3)

Consequently, the element appearing in the jth row and ith column of the matrix B^t A^t is the product of (2) by (3), which gives (1). Thus (AB)^t = B^t A^t.
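The transpose computations of Problems 3.13-3.15 are easy to check by machine. The sketch below is our own plain-Python illustration (helper names are not from the text); it uses the shapes noted in Problem 3.14: for a 2 x 3 matrix A, the product A·A^t is 2 x 2 while A^t·A is 3 x 3.

```python
def transpose(A):
    # Rewrite the rows of A as the columns of A^t (Problem 3.13).
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    # The ij-entry of AB is the ith row of A times the jth column of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 0], [3, -1, 4]]     # the matrix of Problem 3.15
At = transpose(A)
print(mat_mul(A, At))   # [[5, 1], [1, 26]]
print(mat_mul(At, A))   # [[10, -1, 12], [-1, 5, -4], [12, -4, 16]]
```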
52 MATRICES [CHAP. 3 ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 3.17. Circle the distinguished elements in each of the following echelon matrices. Which are row reduced echelon matrices? /l 2 -3 l\ /O 1 7 -5 0\ 5 2-4, 1, \0 7 3/ \0 0/ The distinguished elements are the first nonzero entries in the rows; hence [l)2-3 0l\ /0©7-5 0\ /0O5O2^ 0(1)2 -4, 00,00 20 4 3/ \0 0/ \0 7, An echelon matrix is row reduced if its distinguished elements are each 1 and are the only nonzero entries in their respective columns. Thus the second and third matrices are row reduced, but the first is not. /I -2 3 -1\ 3.18. Given A = 2 -1 2 2 . (i) Reduce A to echelon form, (ii) Reduce A to row \3 1 2 3/ canonical form, i.e. to row reduced echelon form. (i) Apply the operations ^2 -* -2i?i + Rz and R^ ^ -ZRi + ^3. and then the operation iJj -> -7B2 + 3B3 to reduce A to echelon form: '1 -2 3 -1^ Ato|0 3-4 4|to|0 3-4 4 ^0 7 -lOy ii) Method 1. Apply the operation B, -» 2i?2 + ZRi, and then the operations iSj -^ -^3 + 7i?i and Ri ^ 4/?3 + 7i?2 to the last matrix in (i) to further reduce A: IZX 45^ to I 3-4 4 I to I 21 0-12 7 -10 J Finally, multiply Bj by 1/21, R^ by 1/21 and fig by 1/7 to obtain the row canonical form of A: '1 15/7^ 10 -4/7 ^0 1 -10/7 J Method 2. In the last matrix in (i), multiply R^ by 1/3 and fig by 1/7 to obtain an echelon matrix where the distinguished elements are each 1: '1-2 3 -1 1 -4/3 4/3 ^0 1 -10/7/ Now apply the operation R^ -» 2^2 + Ru and then the operations R2, -» (4/3)fi3 + R^ and jBj -^ {—1/3)R3 + jBi to obtain the above row canonical form of A. Remark: Observe that one advantage of the first method is that fractions did not appear until the very last step. CHAP. 3] MATRICES 53 /O 1 3 -2\ 3.19. Determine the row canonical form of A = 2 1-4 3 \2 3 2 -ll 1^ 1 -4 Z\ I2 1 A to Note that the third matrix is already in echelon form. / 6 3 -4\ 3.20. Reduce A = -4 1 -6 to echelon form, and then to row reduced echelon form, \ 1 2-5/ i.e. to its row canonical form. 
The computations are usually simpler if the "pivotal" element is 1. Hence first interchange the first and third rows: A to Note that the third matrix is already in echelon form. 3.21. Show that each of the following elementary row operations has an inverse operation of the same type. m Interchange the ith row and the jth. row: Ri <^ Rj. Multiply the zth row by a nonzero scalar k: Ri -» kRi, fc ^ 0. Replace the ith row by k times the jth row plus the ith row: Ri ^ kRj + Ru (i) Interchanging the same two rows twice, we obtain the original matrix; that is, this operation is its own inverse. (ii) Multiplying the ith row by k and then by fc-i, or by fc-i and then by k, we obtain the original matrix. In other words, the operations iJj -» kRi and i?j -^ fe-iiJj are inverses. (iii) Applying the operation flj -» kRj + Ri and then the operation fij -^ —kRj + fij, or apply- ing the operation fij -* -kRj + i?j and then the operation fij -» kRj + flj, we obtain the orig- inal matrix. In other words, the operations Ri -» kRj + fij and iJj -» —kRj + flj are inverses. SQUARE MATRICES 3.22. Let A = ( ^ _g ) . Find (i) A^ (ii) A*, (iii) /(A), where fix) = 2a^ - 4x + 5. 1 2 4 -3 ^4 -3^ (i) A^ = AA = )il 4) / 1-1 + 2-4 1-2 + 2 -(-3) \ ^ / 9 -4\ V4-l + (-3)-4 4-2 + (-3) -(-3)/ ~ [-8 17/ 54 MATRICES [CHAP. 3 « - = -' = c -:)(-: ») / l-9 + 2-(-8) l-(-4) + 2-17 \ ^ /-7 30\ (^4-9 + (-3) -(-8) 4 -(-4) + (-3) -17/ \eO -67 J (iii) To find /(A), first substitute A for x and 57 for the constant 5 in the given polynomial f(x) = 2x9 _ 4a; + 5: /(A) = 2A3-4A + 5/ = 2(-; -Z) - ^ {\ -s) + K'o l) Then multiply each matrix by its respective scalar: /-14 60\ / -4 -8\ /5 0\ 1^120 -134y V-16 12/ \0 5y Lastly, add the corresponding elements in the matrices: _ / -14 -4 + 5 60-8 + \ _ I' -IS " 1^120-16 + -134 + 12 + 5/ 104 52 \ -117/ 3.23. Referring to Problem 3.22, show that A is a zero of the polynomial g{x) = a;^ + 2a! - 11. A is a zero of g(x) if the matrix g(A) is the zero matrix. 
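The hand reduction to row canonical form carried out in Problems 3.18-3.20 can be sketched as a short Gauss-Jordan routine built from the three elementary row operations. This is our own illustrative sketch (the function name and the use of exact fractions are our choices, not the book's):

```python
from fractions import Fraction

def row_canonical_form(A):
    """Gauss-Jordan elimination to the row reduced echelon form."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                                 # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]              # R_r <-> R_pivot
        M[r] = [x / M[r][c] for x in M[r]]           # make the distinguished element 1
        for i in range(rows):
            if i != r and M[i][c] != 0:
                k = M[i][c]
                M[i] = [a - k * b for a, b in zip(M[i], M[r])]   # R_i -> -k R_r + R_i
        r += 1
        if r == rows:
            break
    return M
```

Applied to the matrix of Problem 3.18 it reproduces the row canonical form found there, with last column 15/7, -4/7, -10/7.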
Compute g(A) as was done for /(A), i.e. first substitute A for x and 11/ for the constant 11 in g(x) = x^ + ^x- 11: „,., = ..... -n, = (-:.)-(! 4) -"GO Then multiply each matrix by the scalar preceding it: / 9 _4x /2 4\ , /-ll g{A) ^ / 9 -4X /2 < V-8 17; V8 -( 6/ V -11 Lastly, add the corresponding elements in the matrices: /9 + 2-11 -4 + 4 + 0\ ^ /O g{A) - ^_g^8^.Q i7_g_;^iy (^0 Since g{A) = 0, A is a zero of the polynomial g(x). 3.24. Given A - [ ) . Find a nonzero column vector u = I ) such that Au - 3m. First set up the matrix equation Au = 3u: 4 -3/U/ ~ ^\y Write each side as a single matrix (column vector): / x + 3y\ ^ /Sx^ \Ax-3yJ V3j/y Set corresponding elements equal to each other to obtain the system of equations (and reduce to echelon form): a; + 3J, = 3a; __ 2x - 3y = _ 2x - Sy = ^^ 2x - Sy = Ax-Zy - Zy Ax- 6y = 0-0 The system reduces to one homogeneous equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution let, say, ?/ = 2; then » = 3. That is, a; = 3, i/ = 2 is a solution of the system. Thus the vector w = ( g j is nonzero and has the property that Au = 3m. CHAP. 3] MATRICES 55 /3 5^ 3.25. Find the inverse of f ^ „ l2 3 Method 1. We seek scalars x, y, z and w for which ,2 3/\2 w/ " \0 l) *"" \2x + Sz 2y + 3w) ,.,..» r3a; + 5« = 1 [31/ + 5w = or which satisfy < and ■( l2a; + 3« = \2j/ + 3w = 1 GO The solution of the first system is a; = —3, z — 2, and of the second system is 2/ = 5, w = —3. /-3 5\ Thus the inverse of the given matrix is 1 ) . Method 2. We derived the general formula for the inverse A-^ oi the 2X2 matrix A Ttaslf A=Q ly ,h.„ |A| = 9-1(1 = -1 .„d A-. = -l(4 -^) = (-J 4). A^i = rTT ( 1 where lAI = ad — 6c 1^1 V-c a' MISCELLANEOUS PROBLEMS 3.26, Compute AB using block multiplication, where 2 I 1\ 3 4 I 1 and B = \0 I 2, Hence „ ^1 and S = ( ) where E, F, G, R, S and T are the given blocks. 
GJ \0 TJ ER ES + FT\ GT J ~ //9 12 15N /3N /I Vl9 26 33/ V^/ \0 \ ( 0) (2) AB = (^^^ ^^^+/^) = jVl9 26 33y V^yVoyj = ji9 26 33 7 3.27. Suppose B = {Ri, R2, . . ., i?„), i.e. that Ri is the ith row of B. Suppose BA is de- fined. Show that BA = (RiA, RzA, . . .,RnA), i.e. that RiA is the ith row of BA. Let i4i, A2, . . ., A"» denote the columns of A. By definition of matrix multiplication, the ith row of BA is {Ri •A\Ri'A\ • . . , i2i • A"). But by matrix multiplication, BjA = (Bj • A^, i?i • A2, . . . , Bj-A™). Thus the ith row of BA is ftjA. 3.28. Let Ci = (0, . . . , 1, . . . , 0) be the row vector with 1 in the tth position and else- where. Show that CjA = Ri, the ith row of A. Observe that Cj is the ith row of /, the identity matrix. By the preceding problem, the tth row of I A is BjA. But lA = A. Accordingly, CjA = JBj, the ith row of A. 3.29. Show: (i) If A has a zero row, then AB has a zero row. (ii) If B has a zero column, then AB has a zero column. (iii) Any matrix with a zero row or a zero column is not invertible. (i) Let jBj be the zero row of A, and B^, . . .,£" the columns of B. Then the ith row of AB is {RrB\ Ri'B^ ..., Ri-B^) = (0, 0, ..., 0) 56 MATRICES [CHAP. 3 (ii) Let Cj be the zero column of B, and Aj, . . ., A„ the rows of A. Then the jth column of AB is /Ai-C/ A^'Cj m'Cj (iii) A matrix A is invertible means that there exists a matrix A~^ such that AA"^ = A~^A —I. But the identity matrix / has no zero row or zero column; hence by (i) and (ii) A cannot have a zero row or a zero column. In other words, a matrix with a zero row or a zero column cannot be invertible. 3.30. Let A and B be invertible matrices (of the same order). Show that the product AB is also invertible and (AB)-^ = B'^A'K Thus by induction, (AiA2- • -An^^ = An"* • • -Az^Ai^ where the Ai are invertible. (AB)(B-iA-i) = A(BB-i)A-i = A/A-i = AA i = / and (B-iA-i)(AB) = B-i(A-iA)B = B-^B = B^B = I Thus (AB)-i = B-iA-i. 3.31. Let u and v be distinct vectors. 
Show that, for each scalar kGK, the vectors u + k{u — v) are distinct. It suffices to show that if u + ki{u — v) = M + k2(u — v) (1) then fcj = k^. Suppose (1) holds. Then ki(u — v) = k^iu — v) or {ki — k2)(u — v) = Since u and v are distinct, u — v¥'0. Hence fci — fcg — and fci = /Cj. ELEMENTARY MATRICES AND APPLICATIONS* 3.32. A matrix obtained from the identity matrix by a single elementary row operation is called an elementary matrix. Determine the 8-square elementary matrices corre- sponding to the operations Ri <^ R2, Ra ^ —IRs and 722 -* — 3i?i + R2. /I o\ Apply the operations to the identity matrix /g = 1 to obtain \o 1/ £^1 = 1 , ^2 = /I 1 \o -7 Eo = 3.33. Prove: Let e be an elementary row operation and E the corresponding m-square elemen- tary matrix, i.e. E-e(l-m). Then for any TO X % matrix A, e{A) = EA. That is, the re- sult e(A) of applying the operation e on the matrix A can be obtained by multiplying A by the corresponding elementary matrix E. Let iJj be the tth row of A; we denote this by writing A = (B^ R^). By Problem 3.27, if B is a matrix for which AB is defined, then AB = (R^B, ..., R-^B). We also let ej = (0, ...,0,1,0 0), A = i *This section is rather detailed and may be omitted in a first reading. It is not needed except for certain results in Chapter 9 on determinants. CHAP. 3] MATRICES 57 Here a = t means that 1 is the ith component. By Problem 3.28, e^A = iJj. We also remark that / = (cj, . . ., e„) is the identity matrix. (i) Let e be the elementary row operation i^j «-> Rj. Then, for a = i and A — j, E - e(I) = (ej Bj ej, . . ., ej and e(A) = (iBj, . . .,£^, . . ., Rt BJ Thus /\ ^ . /s A ^A = (fijA, ...,e^, ...,6iA, ...,e^A) = (fii, . . ., i?,-, . . ., ffj fij = e(A) (ii) Now let e be the elementary row operation jB^ -> fcjBj, fc t^ 0. Then, for a = i, E = e(/) = (6i, ...,fcej, ..., ej and e(A) = (ftj, . . ., fcfij, . . ., BJ /\ /\ Thus ^-A = (fijA, ...,A;ejA, ...,e^A) = (fij, . . . , fefij, . . . 
, «„) = e(A) (iii) Lastly, let e be the elementary row operation JBj ^ kRj + Kj. Then, for /\ — i, E = e(I) = (ei, ...,fcej + ej, ...,6j and e(A) =: (fij, . . . , fcfij + Bj, . . . , i2 J Using (ftej + ej)A = fc(ej.A) + BjA - kRj + Rf, we have EA = (M, ...,(fce^ + ei)A, ...,e„A) = (R^, . . ., kRj + Ri, . . ., RJ = e(A) Thus we have proven the theorem. 3^. Show that A is row equivalent to B if and only if there exist elementary matrices El, ...,Es such that Es- ■ • E2E1A = B. By definition, A is row equivalent to B if there exist elementary row operations ej, . . ..e, for which es(---(e2(ei(A)))- ••) = B. But, by the preceding problem, the above holds if and only if Eg- • -E^EiA = B where JS7j is the elementary matrix corresponding to e^. 3^5. Show that the elementary matrices are invertible and that their inverses are also elementary matrices. Let E be the elementary matrix corresponding to the elementary row operation e: e(I) = E. Let e' be the inverse operation of e (see Problem 3.21) and E' its corresponding elementary matrix. Then, by Problem 3.33, / = e'(e(/)) = e'E = E'E and / = e(e'(I)) = eE' = EE' Therefore E' is the inverse of E. 3M. Prove that the following are equivalent: (i) A is invertible. (ii) A is row equivalent to the identity matrix /. (iii) A is a product of elementary matrices. Suppose A is invertible and suppose A is row equivalent to the row reduced echelon matrix B. Then there exist elementary matrices Ei,E2, ■ ■ -yE^ such that Eg- • ■E2E1A = B. Since A is invert- ible and each elementary matrix E^ is invertible, the product is invertible. But if B ^ I, then B has a zero row (Problem 3.47); hence B is not invertible (Problem 3.29). Thus B = I. In other words, (i) implies (ii). Now if (ii) holds, then there exist elementary matrices E^, E^, . . .,Eg such that E.-'-E^EiA = /, and so A = (E,- ■ -E^Ei)-^ = E^^E^^-'-EJ^ By the preceding problem, the Ei are also elementary matrices. Thus (ii) implies (iii). 
Now if (iii) holds (A = EiE^- ■ -E^), then (i) must follow since the product of invertible matrices is invertible. 58 MATRICES [CHAP. 3 3.37. Let A and B be square matrices of the same order. Show that if AB = I, then B = A-K Thus AB = I if and only if BA = I. Suppose A is not invertible. Then A is not row equivalent to the identity matrix /, and so A is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices El E^ such that E^- • -E^E^A has a zero row. Hence E^- ■ -EJE^AB has a zero row. Accordingly, AB is row equivalent to a matrix with a zero row and so is not row equivalent to /. But this con- tradicts the fact that AB = /. Thus A is invertible. Consequently, B = IB = (A'-->^A)B = A-HAB) = A'^I = A"! 1.38. Suppose A is invertible and, say, it is row reducible to the identity matrix / by the sequence of elementary operations ei, . . ., e„. (i) Show that this sequence of elemen- tary row operations applied to / yields A-K (ii) Use this result to obtain the inverse /I 2\ of A = 2 -1 3 \4 1 8/ (i) Let Ei be the elementary matrix corresponding to the operation ej. Then, by hypothesis and Problem 3.34, E„- • -E^EiA = I. Thus (E„- ■ ■EiEJ)A ^ I and hence A i = E^---EJEJ In other words, A -i can be obtained from / by applying the elementary row operations ej, (ii) Form the block matrix (A, I) and row reduce it to row canonical form: 2 I 1 0\ /l 2 (A, /) = [2-1 3 I 1 I to I -1 1 8 I e« to 2 I 1 -1 I -2 1 -1 I -6 1 to Observe that the final block matrix is in the form (/, B). Hence A is invertible and B is its inverse: A-i = Remark: In case the final block matrix is not of the form (/, B), then the given matrix is not row equivalent to I and so is not invertible. Supplementary Problems MATRIX OPERATIONS In Problems 3.39-3.41, let -(o1 !)• -"i-l 4 0-3 2 3 3.39. Find: (i) A + B, (ii) A + C, (iii) 3A - 4B. 3.40. Find: (i) AB, (ii) AC, (iii) AZ), (W) BC, (y) BD, (wi) CD. 3.41. Find: (i) A*, (ii) A'C, (iii) IJtA', (iv) B«A, (y) DW. 
{wi) DDK CHAP. 3] MATRICES 59 61 &2 h h], find (i) ejA, C, C2 C3 C4/ 3.43. Let Cj = (0, . ... 0, 1, 0, .... 0) where 1 is the ith component. Show the following: (i) Be*. = Cj, the ith column of B. (By Problem 3.28, ejA = Bj.) (ii) If e^A = ejB for each i, then A = B. (iii) If Ae\ = Be\ for each i, then A = B. ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 3.44. Reduce A to echelon form and then to its row canonical form, where /l 2-1 2 l\ /2 3-2 5 1^ (i) A = 2 4 1-2 3 , (ii) A = 3 -1 2 4 \3 6 2-6 5/ \4 -5 6 -5 Ij 3.45. Reduce A to echelon form and then to its row canonical form, where /l 3 -1 2\ /O 1 3 -2^ (i) A 11 -5 3 ,..^ ^ 4-13 2-5 3 1 ■ (") ^=0021 \4 1 1 5/ \0 5 -3 4, 3.46. Describe all the possible 2X2 matrices which are in row reduced echelon form. 3.47. Suppose A is a square row reduced echelon matrix. Show that if A # 7, the identity matrix, then A has a zero row. 3.48. Show that every square echelon matrix is upper triangular, but not vice versa. 3.49. Show that row equivalence is an equivalence relation: (i) A is row equivalent to A; (ii) A row equivalent to B implies B row equivalent to A; (iii) A row equivalent to B and B row equivalent to C Implies A row equivalent to C. SQUARE MATRICES 3.50. "Let A = ( \ . (i) Find A^ and A3, (ii) If /(«) = vfl - Zx^ - 2x + i, find /(A), (iii) If g(x) = a;2 - a; - 8, find g(A). 3.51. Let B = f ]. (i)U f(x) = 2x2 - 4x + Z, find f{B). (ii) If g{x) = x^ - 4x - 12, find g(B). (iii) Find a nonzero column vector u = ( ] such that Bu = 6m. 3.52. Matrices A and B are said to commute if AB = BA. Find all matrices ( ^ ) which commute . , /I 1\ Vz w/ with ( VO 1 3.53. Let A = ( ) . Find A". <ll) 60 MATRICES [CHAP. 3 3.54. Let A = ( „ „ and B = ( „ ,, \0 3/ \0 11 Find: (i) A + B, (ii) AB, (iii) A^ and A3, (iv) A", (v) /(A) for a polynomial f{x). 3.56. Suppose the 2-square matrix B commutes with every 2-square matrix A, i.e. AB = BA. Show that B = ( , ) for some scalar k, i.e. B is a scalar matrix. \0 kj 3.57. 
Let Dfc be the m-square scalar matrix with diagonal elements k. Show that: (i) for any mXn matrix A, D^A = kA; (ii) for any nXm matrix B, BD^ = kB. 3.58. Show that the sum, product and scalar multiple of: (i) upper triangular matrices is upper triangular; (ii) lower triangular matrices is lower triangular; (iii) diagonal matrices is diagonal; (iv) scalar matrices is scalar. INVERTIBLE MATRICES - -' '2 -3\ 3.59. Find the inverse of each matrix: (i) ( ^ c ) > (") ( i 3/ -1 2 -3\ 2 1 -V 3.60. Find the inverse of each matrix: (i) | 2 1 , (ii) 2 1 ,4-2 5/ \5 2 -3/ 1 3 4\ t.61. Find the inverse of | 3 -1 6 l-l 5 1/ 3.62. Show that the operations of inverse and transpose commute; that is, (A«)-i = (A-i)«. Thus, m particular, A is invertible if and only if A* is invertible. (»! ... °. ."! ///. . .M invertible, and what is its inverse? 3.64. Show that A is row equivalent to B if and only if there exists an invertible matrix P such that B = PA. 3.65. Show that A is invertible if and only if the system AX = has only the zero solution. MISCELLANEOUS PROBLEMS 3.66. Prove Theorem 3.2: (iii) (B + QA = BA + CA; (iv) k(AB) = (kA)B = A(kB), where fc is a scalar. (Parts (i) and (ii) were proven in Problem 3.11 and 3.12.) 3.67. Prove Theorem 3.3: (i) (A + B)* = A* + BH (ii) (A')« = A; (iii) (feA)' = kA*, for k a scalar. (Part (iv) was proven in Problem 3.16.) 3.68. Suppose A = (A^) and B = (B^,) are block matrices for which AB is defined and the number of columns of each block Aj^ is equal to the number of rows of each block B^j. Show that AB - (Gy) where Cy = 2 Ag^B^j. CHAP. 3] MATRICES 61 3.69. The following operations are called elementary column operations: [El] [^3] Interchange the tth column and the jth column. Multiply the tth column by a nonzero scalar A;. Replace the ith column by k times the jth column plus the ith column. Show that each of the operations has an inverse operation of the same type. 3.70. 
A matrix A is said to be equivalent to a matrix BUB can be obtained from A by a finite sequence of operations, each being an elementary row or column operation. Show that matrix equivalence is an equivalence relation. 3.71. Show that two consistent systems of linear equations have the same solution set if and only if their augmented matrices are row equivalent. (We assume that zero rows are added so that both aug- mented matrices have the same number of rows.) Answers to Supplementary Problems 3.39. « i-l -1 1) (ii) Not defined. (iii) f -13 4 -3 18 \ 17 0/ 3.40. (i) Not defined. <"« C) <^' CI /-5 -2 4 (") ( 11 -3 -12 «') <"> m ": 8 1) (vi) Not 1 0\ / 4 -7 4\ / 4 -2 3.41. (i) I -1 3 I (ii) Not defined. (iii) (9, 9) (iv) 0-6-8 (v) 14 (vi) -2 1-3 2 4/ \-3 12 6/ \ 6 -3 9y 3.42. (i) (ai, ag. 03. a*) (ii) (61, 62. K K) (iii) (Ci. Cg, Cg, C4) '1 2 4/3^ 3.44. (i) ( 3-6 1 | and ( 1 ^0 1 -1/6^ /2 3-2 5 l\ /l 4/11 5/11 13/11 \ (ii) -11 10 -15 5 and 1 -10/11 15/11 -5/11 \0 0/ \0 / 3.45. (i) r''-° " and /I 3 -1 2 11 -5 3 ^0 0/ /o 1 3 -2 -13 11 35 0/ (ii) L : and ll 4/11 13/11 1 -5/11 3/11 1 o\ 1 1 0/ 62 MATRICES [CHAP. 3 '■''■ Co o)'(2 D'Co I) ''Co J) -^^'■^ '^ '^ ^"^ «'=^'^" ^0 1 V 3.48. ( 1 1 ) Is upper triangular but not an echelon matrix. iO 1/ 3.52. Only matrices of the form ( ] commute with (^ ^J ■ 3.53. /I 2n\ ^" = (o i) 3.54. /9 « ^+^ = (o 14 „ /14 ON « - = c :)■ - = c .;) '" '<^' - (T ;,; 2" (»)^^=(o 33; ('^^ ^"=U 3«, /3ci 3d,^ / 5 -2\ ,.., / 1/3 1/3 3.59. (1) (^ 3) (n) f ^/9 2/g 1-5 4 -3\ / 8 -1 -3^ 3.60. (i) 10-7 6 11 -5 12 8-6 5/ \ 10 -1 -4y /31/2 -17/2 -11^ 3.61. 9/2 -5/2 -3 \-7 4 51 3.62. Given AA-i = /. Then 7 = 7' = (AA-i)' = {A'^YAK That is, (A^^)' = (A*)"!. /a-i .0 a-i ... 3.63. A is invertible iff each aj 9^ 0. Then A ^ - ' \0 chapter 4 Vector Spaces and Subspaces INTRODUCTION In Chapter 1 we studied the concrete structures B" and C" and derived various proper- ties. 
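Before leaving the chapter, the inversion procedure of Problem 3.38 — row reducing the block matrix (A, I) until it has the form (I, A^-1) — can be sketched in Python. This is our own sketch (exact fractions and the function name are our choices); it raises an error when A is not invertible, i.e. when the block matrix cannot be brought to the form (I, B).

```python
from fractions import Fraction

def inverse(A):
    """Row reduce (A | I) to (I | A^-1), as in Problem 3.38."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")   # (A | I) cannot reach (I | B)
        M[c], M[pivot] = M[pivot], M[c]                    # bring a nonzero pivot up
        M[c] = [x / M[c][c] for x in M[c]]                 # scale the pivot row to 1
        for i in range(n):
            if i != c and M[i][c] != 0:
                k = M[i][c]
                M[i] = [a - k * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

# Example 3.15's pair of inverse matrices is recovered:
A_inv = inverse([[1, 0, 2], [2, -1, 3], [4, 1, 8]])
assert A_inv == [[-11, 2, 2], [-4, 0, 1], [6, -1, -1]]
```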
Now certain of these properties will play the role of axioms as we define abstract "vector spaces" or, as it is sometimes called, "linear spaces". In particular, the conclu- sions (i) through (viii) of Theorem 1.1, page 3, become axioms [A]]-[A4], [Afi]-[M4] below. We will see that, in a certain sense, we get nothing new. In fact, we prove in Chapter 5 that every vector space over R which has "finite dimension" (defined there) can be identified with R" for some n. The definition of a vector space involves an arbitrary field (see Appendix B) whose elements are called scalars. We adopt the following notation (unless otherwise stated or implied): K the field of scalars, a, &, c or A; the elements of K, V the given vector space, u, V, w the elements of V. We remark that nothing essential is lost if the reader assumes that K is the real field R or the complex field C. Lastly, we mention that the "dot product", and related notions such as orthogonality, is not considered as part of the fundamental vector space structure, but as an additional structure which may or may not be introduced. Such spaces shall be investigated in the latter part of the text. Definition : Let iiT be a given field and let V be a nonempty set with rules of addition and scalar multiplication which assigns to any u,v GV a sum u + v GV and to any uGV,kGK a product ku G V. Then V is called a vector space over K (and the elements of V are called vectors) if the following axioms hold: [Ai]: For any vectors u,v,w GV, {u + v) +w = u + {v-i- w). [A2]: There is a vector in V, denoted by and called the zero vector, for which u + Q — u for any vector u GV. [A3] : For each vector uGV there is a vector in V, denoted by —u, for which u + {—u) = 0. [A4]: For any vectors u,v GV, u + v = v + u. [Ml]: For any scalar k G K and any vectors u,v GV, k{u + v) = ku + kv. [M2] : For any scalars a,b GK and any vector u GV, (a + b)u = au + bu. [Ms]: For any scalars a,b G K and any vector u GV, {ab)u = a{bu). 
[M4]: For the unit scalar 1 ∈ K, 1u = u for any vector u ∈ V.

VECTOR SPACES AND SUBSPACES [CHAP. 4

The above axioms naturally split into two sets. The first four are concerned only with the additive structure of V and can be summarized by saying that V is a commutative group (see Appendix B) under addition. It follows that any sum of vectors of the form

v1 + v2 + ... + vm

requires no parentheses and does not depend upon the order of the summands, the zero vector 0 is unique, the negative -u of u is unique, and the cancellation law holds:

u + w = v + w  implies  u = v

for any vectors u, v, w ∈ V. Also, subtraction is defined by

u - v = u + (-v)

On the other hand, the remaining four axioms are concerned with the "action" of the field K on V. Observe that the labelling of the axioms reflects this splitting. Using these additional axioms we prove (Problem 4.1) the following simple properties of a vector space.

Theorem 4.1: Let V be a vector space over a field K.
(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any scalar k ∈ K and any vector u ∈ V, (-k)u = k(-u) = -ku.

EXAMPLES OF VECTOR SPACES

We now list a number of important examples of vector spaces. The first example is a generalization of the space R^n.

Example 4.1: Let K be an arbitrary field. The set of all n-tuples of elements of K with vector addition and scalar multiplication defined by

(a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)
and
k(a1, a2, ..., an) = (ka1, ka2, ..., kan)

where ai, bi, k ∈ K, is a vector space over K; we denote this space by K^n. The zero vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0). The proof that K^n is a vector space is identical to the proof of Theorem 1.1, which we may now regard as stating that R^n with the operations defined there is a vector space over R.

Example 4.2: Let V be the set of all m × n matrices with entries from an arbitrary field K.
Then V is a vector space over K with respect to the operations of matrix addition and scalar multiplication, by Theorem 3.1.

Example 4.3: Let V be the set of all polynomials a0 + a1 t + a2 t^2 + ... + an t^n with coefficients ai from a field K. Then V is a vector space over K with respect to the usual operations of addition of polynomials and multiplication by a constant.

Example 4.4: Let K be an arbitrary field and let X be any nonempty set. Consider the set V of all functions from X into K. The sum of any two functions f, g ∈ V is the function f + g ∈ V defined by

(f + g)(x) = f(x) + g(x)

and the product of a scalar k ∈ K and a function f ∈ V is the function kf ∈ V defined by

(kf)(x) = k f(x)

Then V with the above operations is a vector space over K (Problem 4.5). The zero vector in V is the zero function 0 which maps each x ∈ X into 0 ∈ K: 0(x) = 0 for every x ∈ X. Furthermore, for any function f ∈ V, -f is that function in V for which (-f)(x) = -f(x), for every x ∈ X.

Example 4.5: Suppose E is a field which contains a subfield K. Then E can be considered to be a vector space over K, taking the usual addition in E to be the vector addition and defining the scalar product kv of k ∈ K and v ∈ E to be the product of k and v as elements of the field E. Thus the complex field C is a vector space over the real field R, and the real field R is a vector space over the rational field Q.

SUBSPACES

Let W be a subset of a vector space V over a field K. W is called a subspace of V if W is itself a vector space over K with respect to the operations of vector addition and scalar multiplication on V. Simple criteria for identifying subspaces follow.

Theorem 4.2: W is a subspace of V if and only if
(i) W is nonempty,
(ii) W is closed under vector addition: v, w ∈ W implies v + w ∈ W,
(iii) W is closed under scalar multiplication: v ∈ W implies kv ∈ W for every k ∈ K.
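The closure conditions of Theorem 4.2 are easy to test numerically. The following Python sketch (our own illustration, not from the text; the helper names `add`, `scale` and `in_W` are ours) checks conditions (ii) and (iii) on sample vectors for the subset of R^3 whose third component is zero:

```python
# Check the closure conditions of Theorem 4.2 for a subset W of R^3.
# Here W = {(a, b, 0)}; `in_W` is an illustrative membership predicate.

def add(u, v):
    """Componentwise vector addition in R^n."""
    return tuple(x + y for x, y in zip(u, v))

def scale(k, u):
    """Scalar multiplication in R^n."""
    return tuple(k * x for x in u)

def in_W(u):
    """Membership test for W = {(a, b, 0)}."""
    return u[2] == 0

samples = [(1, 2, 0), (-3, 5, 0), (0, 0, 0)]
scalars = [0, 1, -2, 7]

closed_add = all(in_W(add(u, v)) for u in samples for v in samples)
closed_scale = all(in_W(scale(k, u)) for k in scalars for u in samples)
print(closed_add and closed_scale)   # True: W passes both closure tests
```

Of course, a finite test cannot prove closure; it merely illustrates the two conditions that the algebraic arguments below (Problems 4.9-4.14) verify in general.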
Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W (or W ≠ ∅), and (ii) v, w ∈ W implies av + bw ∈ W for every a, b ∈ K.

Example 4.6: Let V be any vector space. Then the set {0} consisting of the zero vector alone, and also the entire space V, are subspaces of V.

Example 4.7: (i) Let V be the vector space R^3. Then the set W consisting of those vectors whose third component is zero, W = {(a, b, 0) : a, b ∈ R}, is a subspace of V.

(ii) Let V be the space of all square n × n matrices (see Example 4.2). Then the set W consisting of those matrices A = (aij) for which aij = aji, called symmetric matrices, is a subspace of V.

(iii) Let V be the space of polynomials (see Example 4.3). Then the set W consisting of polynomials with degree ≤ n, for a fixed n, is a subspace of V.

(iv) Let V be the space of all functions from a nonempty set X into the real field R. Then the set W consisting of all bounded functions in V is a subspace of V. (A function f ∈ V is bounded if there exists M ∈ R such that |f(x)| ≤ M for every x ∈ X.)

Example 4.8: Consider any homogeneous system of linear equations in n unknowns with, say, real coefficients:

a11 x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + a22 x2 + ... + a2n xn = 0
...............................
am1 x1 + am2 x2 + ... + amn xn = 0

Recall that any particular solution of the system may be viewed as a point in R^n. The set W of all solutions of the homogeneous system is a subspace of R^n (Problem 4.16) called the solution space. We comment that the solution set of a nonhomogeneous system of linear equations in n unknowns is not a subspace of R^n.

Example 4.9: Let U and W be subspaces of a vector space V. We show that the intersection U ∩ W is also a subspace of V. Clearly 0 ∈ U and 0 ∈ W since U and W are subspaces; whence 0 ∈ U ∩ W. Now suppose u, v ∈ U ∩ W. Then u, v ∈ U and u, v ∈ W and, since U and W are subspaces,

au + bv ∈ U   and   au + bv ∈ W

for any scalars a, b ∈ K. Accordingly, au + bv ∈ U ∩ W and so U ∩ W is a subspace of V.
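Example 4.8 can be illustrated directly: a linear combination of solutions of a homogeneous system Ax = 0 is again a solution. A short Python sketch (the system and the two particular solutions below are our own illustrative choices, not from the text):

```python
# Illustrate Example 4.8: solutions of a homogeneous system A x = 0
# are closed under linear combinations.

def apply(A, x):
    """Multiply matrix A (given as a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, -1]]          # one equation: x1 + 2*x2 - x3 = 0
u = [1, 0, 1]             # a solution: 1 + 0 - 1 = 0
v = [0, 1, 2]             # a solution: 0 + 2 - 2 = 0

a, b = 3, -5              # arbitrary scalars
combo = [a * ui + b * vi for ui, vi in zip(u, v)]

print(apply(A, u), apply(A, v), apply(A, combo))   # [0] [0] [0]
```

The same computation with a nonzero right-hand side fails, which is why the solution set of a nonhomogeneous system is not a subspace.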
The result in the preceding example generalizes as follows.

Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace of V.

LINEAR COMBINATIONS, LINEAR SPANS

Let V be a vector space over a field K and let v1, ..., vm ∈ V. Any vector in V of the form

a1 v1 + a2 v2 + ... + am vm

where the ai ∈ K, is called a linear combination of v1, ..., vm. The following theorem applies.

Theorem 4.5: Let S be a nonempty subset of V. The set of all linear combinations of vectors in S, denoted by L(S), is a subspace of V containing S. Furthermore, if W is any other subspace of V containing S, then L(S) ⊆ W.

In other words, L(S) is the smallest subspace of V containing S; hence it is called the subspace spanned or generated by S. For convenience, we define L(∅) = {0}.

Example 4.10: Let V be the vector space R^3. The linear span of any nonzero vector u consists of all scalar multiples of u; geometrically, it is the line through the origin and the point u. The linear span of any two vectors u and v which are not multiples of each other is the plane through the origin and the points u and v.

Example 4.11: The vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) generate the vector space R^3. For any vector (a, b, c) ∈ R^3 is a linear combination of the ei; specifically,

(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ae1 + be2 + ce3

Example 4.12: The polynomials 1, t, t^2, t^3, ... generate the vector space V of all polynomials (in t): V = L(1, t, t^2, ...). For any polynomial is a linear combination of 1 and powers of t.

Example 4.13: Determine whether or not the vector v = (3, 9, -4, -2) is a linear combination of the vectors u1 = (1, -2, 0, 3), u2 = (2, 3, 0, -1) and u3 = (2, -1, 2, 1), i.e. belongs to the space spanned by the ui.
Set v as a linear combination of the ui using unknowns x, y and z; that is, set v = x u1 + y u2 + z u3:

(3, 9, -4, -2) = x(1, -2, 0, 3) + y(2, 3, 0, -1) + z(2, -1, 2, 1)
              = (x + 2y + 2z, -2x + 3y - z, 2z, 3x - y + z)

Form the equivalent system of equations by setting corresponding components equal to each other, and then reduce to echelon form:

 x + 2y + 2z =  3          x + 2y + 2z =  3          x + 2y + 2z =  3
-2x + 3y -  z =  9    or       7y + 3z = 15    or        7y + 3z = 15
          2z = -4                   2z = -4                   2z = -4
 3x -  y +  z = -2            -7y - 5z = -11

Note that the above system is consistent and so has a solution; hence v is a linear combination of the ui. Solving for the unknowns we obtain x = 1, y = 3, z = -2. Thus v = u1 + 3u2 - 2u3.

Note that if the system of linear equations were not consistent, i.e. had no solution, then the vector v would not be a linear combination of the ui.

ROW SPACE OF A MATRIX

Let A be an arbitrary m × n matrix over a field K:

    ( a11  a12  ...  a1n )
A = ( a21  a22  ...  a2n )
    ( ................... )
    ( am1  am2  ...  amn )

The rows of A,

R1 = (a11, a12, ..., a1n),  ...,  Rm = (am1, am2, ..., amn)

viewed as vectors in K^n, span a subspace of K^n called the row space of A. That is,

row space of A = L(R1, R2, ..., Rm)

Analogously, the columns of A, viewed as vectors in K^m, span a subspace of K^m called the column space of A.

Now suppose we apply an elementary row operation on A,

(i) Ri ↔ Rj,   (ii) Ri → kRi, k ≠ 0,   or   (iii) Ri → kRj + Ri

and obtain a matrix B. Then each row of B is clearly a row of A or a linear combination of rows of A. Hence the row space of B is contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on B and obtain A; hence the row space of A is contained in the row space of B. Accordingly, A and B have the same row space. This leads us to the following theorem.

Theorem 4.6: Row equivalent matrices have the same row space.
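The elimination in Example 4.13 can be reproduced with exact rational arithmetic. The Python sketch below (our own helper, not from the text) exploits the third component, which forces 2z = -4, then solves the remaining 2 × 2 system by Cramer's rule and checks all four components:

```python
from fractions import Fraction as F

# Data from Example 4.13: is v a linear combination of u1, u2, u3?
u1, u2, u3 = (1, -2, 0, 3), (2, 3, 0, -1), (2, -1, 2, 1)
v = (3, 9, -4, -2)

# The third component forces 2z = -4, so z = -2.
z = F(v[2], u3[2])
# The first two components then give a 2x2 system in x and y:
#   x + 2y = 3 - 2z    and    -2x + 3y = 9 + z
b1, b2 = v[0] - 2 * z, v[1] + z
det = F(1 * 3 - 2 * (-2))          # det of [[1, 2], [-2, 3]] = 7
x = (3 * b1 - 2 * b2) / det        # Cramer's rule
y = (1 * b2 + 2 * b1) / det

# Verify all four components, including the fourth equation 3x - y + z = -2.
combo = tuple(x * a + y * b + z * c for a, b, c in zip(u1, u2, u3))
print(x, y, z, combo == tuple(map(F, v)))   # 1 3 -2 True
```

Had the fourth component failed to match, the system would be inconsistent and v would not lie in the span of the ui.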
We shall prove (Problem 4.31), in particular, the following fundamental result concerning row reduced echelon matrices.

Theorem 4.7: Row reduced echelon matrices have the same row space if and only if they have the same nonzero rows.

Thus every matrix is row equivalent to a unique row reduced echelon matrix called its row canonical form. We apply the above results in the next example.

Example 4.14: Show that the space U generated by the vectors

u1 = (1, 2, -1, 3),   u2 = (2, 4, 1, -2)   and   u3 = (3, 6, 3, -7)

and the space V generated by the vectors v1 = (1, 2, -4, 11) and v2 = (2, 4, -5, 14) are equal; that is, U = V.

Method 1. Show that each ui is a linear combination of v1 and v2, and show that each vi is a linear combination of u1, u2 and u3. Observe that we have to show that six systems of linear equations are consistent.

Method 2. Form the matrix A whose rows are the ui, and row reduce A to row canonical form:

    ( 1  2  -1   3 )        ( 1  2  -1   3 )        ( 1  2  0   1/3 )
A = ( 2  4   1  -2 )   to   ( 0  0   3  -8 )   to   ( 0  0  1  -8/3 )
    ( 3  6   3  -7 )        ( 0  0   6 -16 )        ( 0  0  0     0 )

Now form the matrix B whose rows are v1 and v2, and row reduce B to row canonical form:

B = ( 1  2  -4  11 )   to   ( 1  2  -4  11 )   to   ( 1  2  0   1/3 )
    ( 2  4  -5  14 )        ( 0  0   3  -8 )        ( 0  0  1  -8/3 )

Since the nonzero rows of the reduced matrices are identical, the row spaces of A and B are equal and so U = V.

SUMS AND DIRECT SUMS

Let U and W be subspaces of a vector space V. The sum of U and W, written U + W, consists of all sums u + w where u ∈ U and w ∈ W:

U + W = {u + w : u ∈ U, w ∈ W}

Note that 0 = 0 + 0 ∈ U + W, since 0 ∈ U, 0 ∈ W. Furthermore, suppose u + w and u' + w' belong to U + W, with u, u' ∈ U and w, w' ∈ W. Then

(u + w) + (u' + w') = (u + u') + (w + w') ∈ U + W

and, for any scalar k,

k(u + w) = ku + kw ∈ U + W

Thus we have proven the following theorem.

Theorem 4.8: The sum U + W of the subspaces U and W of V is also a subspace of V.

Example 4.15: Let V be the vector space of 2 by 2 matrices over R.
Let U consist of those matrices in V whose second row is zero, and let W consist of those matrices in V whose second column is zero:

U = { ( a  b ; 0  0 ) : a, b ∈ R },   W = { ( a  0 ; c  0 ) : a, c ∈ R }

Now U and W are subspaces of V. We have:

U + W = { ( a  b ; c  0 ) : a, b, c ∈ R }   and   U ∩ W = { ( a  0 ; 0  0 ) : a ∈ R }

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W consists of those matrices whose second row and second column are zero.

Definition: The vector space V is said to be the direct sum of its subspaces U and W, denoted by

V = U ⊕ W

if every vector v ∈ V can be written in one and only one way as v = u + w where u ∈ U and w ∈ W.

The following theorem applies.

Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only if: (i) V = U + W, and (ii) U ∩ W = {0}.

Example 4.16: In the vector space R^3, let U be the xy plane and let W be the yz plane:

U = {(a, b, 0) : a, b ∈ R}   and   W = {(0, b, c) : b, c ∈ R}

Then R^3 = U + W since every vector in R^3 is the sum of a vector in U and a vector in W. However, R^3 is not the direct sum of U and W since such sums are not unique; for example,

(3, 5, 7) = (3, 1, 0) + (0, 4, 7)   and also   (3, 5, 7) = (3, -4, 0) + (0, 9, 7)

Example 4.17: In R^3, let U be the xy plane and let W be the z axis:

U = {(a, b, 0) : a, b ∈ R}   and   W = {(0, 0, c) : c ∈ R}

Now any vector (a, b, c) ∈ R^3 can be written as the sum of a vector in U and a vector in W in one and only one way:

(a, b, c) = (a, b, 0) + (0, 0, c)

Accordingly, R^3 is the direct sum of U and W; that is, R^3 = U ⊕ W.

Solved Problems

VECTOR SPACES

4.1. Prove Theorem 4.1: Let V be a vector space over a field K. (i) For any scalar k ∈ K and 0 ∈ V, k0 = 0. (ii) For 0 ∈ K and any vector u ∈ V, 0u = 0. (iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0. (iv) For any k ∈ K and any u ∈ V, (-k)u = k(-u) = -ku.

(i) By axiom [A2] with u = 0, we have 0 + 0 = 0. Hence by axiom [M1],

k0 = k(0 + 0) = k0 + k0

Adding -k0 to both sides gives the desired result.

(ii) By a property of K, 0 + 0 = 0.
Hence by axiom [M2],

0u = (0 + 0)u = 0u + 0u

Adding -0u to both sides yields the required result.

(iii) Suppose ku = 0 and k ≠ 0. Then there exists a scalar k^-1 such that k^-1 k = 1; hence

u = 1u = (k^-1 k)u = k^-1 (ku) = k^-1 0 = 0

(iv) Using u + (-u) = 0, we obtain 0 = k0 = k(u + (-u)) = ku + k(-u). Adding -ku to both sides gives -ku = k(-u).

Using k + (-k) = 0, we obtain 0 = 0u = (k + (-k))u = ku + (-k)u. Adding -ku to both sides yields -ku = (-k)u. Thus (-k)u = k(-u) = -ku.

4.2. Show that for any scalar k and any vectors u and v, k(u - v) = ku - kv.

Using the definition of subtraction (u - v = u + (-v)) and the result of Theorem 4.1(iv) (k(-v) = -kv),

k(u - v) = k(u + (-v)) = ku + k(-v) = ku + (-kv) = ku - kv

4.3. In the statement of axiom [M2], (a + b)u = au + bu, which operation does each plus sign represent?

The + in (a + b)u denotes the addition of the two scalars a and b; hence it represents the addition operation in the field K. On the other hand, the + in au + bu denotes the addition of the two vectors au and bu; hence it represents the operation of vector addition. Thus each + represents a different operation.

4.4. In the statement of axiom [M3], (ab)u = a(bu), which operation does each product represent?

In (ab)u the product ab of the scalars a and b denotes multiplication in the field K, whereas the product of the scalar ab and the vector u denotes scalar multiplication. In a(bu) the product bu of the scalar b and the vector u denotes scalar multiplication; also, the product of the scalar a and the vector bu denotes scalar multiplication.

4.5. Let V be the set of all functions from a nonempty set X into a field K. For any functions f, g ∈ V and any scalar k ∈ K, let f + g and kf be the functions in V defined as follows:

(f + g)(x) = f(x) + g(x)   and   (kf)(x) = k f(x),   ∀x ∈ X

(The symbol ∀ means "for every".) Prove that V is a vector space over K.

Since X is nonempty, V is also nonempty.
We now need to show that all the axioms of a vector space hold.

[A1]: Let f, g, h ∈ V. To show that (f + g) + h = f + (g + h), it is necessary to show that the function (f + g) + h and the function f + (g + h) both assign the same value to each x ∈ X. Now,

((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x),   ∀x ∈ X
(f + (g + h))(x) = f(x) + (g + h)(x) = f(x) + (g(x) + h(x)),   ∀x ∈ X

But f(x), g(x) and h(x) are scalars in the field K where addition of scalars is associative; hence

(f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x))

Accordingly, (f + g) + h = f + (g + h).

[A2]: Let 0 denote the zero function: 0(x) = 0, ∀x ∈ X. Then for any function f ∈ V,

(f + 0)(x) = f(x) + 0(x) = f(x) + 0 = f(x),   ∀x ∈ X

Thus f + 0 = f, and 0 is the zero vector in V.

[A3]: For any function f ∈ V, let -f be the function defined by (-f)(x) = -f(x). Then,

(f + (-f))(x) = f(x) + (-f)(x) = f(x) - f(x) = 0 = 0(x),   ∀x ∈ X

Hence f + (-f) = 0.

[A4]: Let f, g ∈ V. Then

(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x),   ∀x ∈ X

Hence f + g = g + f. (Note that f(x) + g(x) = g(x) + f(x) follows from the fact that f(x) and g(x) are scalars in the field K where addition is commutative.)

[M1]: Let f, g ∈ V and k ∈ K. Then

(k(f + g))(x) = k((f + g)(x)) = k(f(x) + g(x)) = kf(x) + kg(x) = (kf)(x) + (kg)(x) = (kf + kg)(x),   ∀x ∈ X

Hence k(f + g) = kf + kg. (Note that k(f(x) + g(x)) = kf(x) + kg(x) follows from the fact that k, f(x) and g(x) are scalars in the field K where multiplication is distributive over addition.)

[M2]: Let f ∈ V and a, b ∈ K. Then

((a + b)f)(x) = (a + b)f(x) = af(x) + bf(x) = (af)(x) + (bf)(x) = (af + bf)(x),   ∀x ∈ X

Hence (a + b)f = af + bf.

[M3]: Let f ∈ V and a, b ∈ K. Then

((ab)f)(x) = (ab)f(x) = a(bf(x)) = a((bf)(x)) = (a(bf))(x),   ∀x ∈ X

Hence (ab)f = a(bf).

[M4]: Let f ∈ V. Then, for the unit 1 ∈ K, (1f)(x) = 1·f(x) = f(x), ∀x ∈ X. Hence 1f = f.

Since all the axioms are satisfied, V is a vector space over K.

4.6.
Let V be the set of ordered pairs of real numbers: V = {(a, b) : a, b ∈ R}. Show that V is not a vector space over R with respect to each of the following operations of addition in V and scalar multiplication on V:

(i) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b);
(ii) (a, b) + (c, d) = (a, b) and k(a, b) = (ka, kb);
(iii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (k^2 a, k^2 b).

In each case show that one of the axioms of a vector space does not hold.

(i) Let r = 1, s = 2, v = (3, 4). Then

(r + s)v = 3(3, 4) = (9, 4)
rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (6, 4) = (9, 8)

Since (r + s)v ≠ rv + sv, axiom [M2] does not hold.

(ii) Let v = (1, 2), w = (3, 4). Then

v + w = (1, 2) + (3, 4) = (1, 2)
w + v = (3, 4) + (1, 2) = (3, 4)

Since v + w ≠ w + v, axiom [A4] does not hold.

(iii) Let r = 1, s = 2, v = (3, 4). Then

(r + s)v = 3(3, 4) = (27, 36)
rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (12, 16) = (15, 20)

Thus (r + s)v ≠ rv + sv, and so axiom [M2] does not hold.

SUBSPACES

4.7. Prove Theorem 4.2: W is a subspace of V if and only if (i) W is nonempty, (ii) v, w ∈ W implies v + w ∈ W, and (iii) v ∈ W implies kv ∈ W for every scalar k ∈ K.

Suppose W satisfies (i), (ii) and (iii). By (i), W is nonempty; and by (ii) and (iii), the operations of vector addition and scalar multiplication are well defined for W. Moreover, the axioms [A1], [A4], [M1], [M2], [M3] and [M4] hold in W since the vectors in W belong to V. Hence we need only show that [A2] and [A3] also hold in W. By (i), W is nonempty, say u ∈ W. Then by (iii), 0u = 0 ∈ W and v + 0 = v for every v ∈ W. Hence W satisfies [A2]. Lastly, if v ∈ W then (-1)v = -v ∈ W and v + (-v) = 0; hence W satisfies [A3]. Thus W is a subspace of V.

Conversely, if W is a subspace of V then clearly (i), (ii) and (iii) hold.

4.8. Prove Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W and (ii) v, w ∈ W implies av + bw ∈ W for all scalars a, b ∈ K.

Suppose W satisfies (i) and (ii).
Then, by (i), W is nonempty. Furthermore, if v, w ∈ W then, by (ii), v + w = 1v + 1w ∈ W; and if v ∈ W and k ∈ K then, by (ii), kv = kv + 0v ∈ W. Thus by Theorem 4.2, W is a subspace of V.

Conversely, if W is a subspace of V then clearly (i) and (ii) hold in W.

4.9. Let V = R^3. Show that W is a subspace of V where:

(i) W = {(a, b, 0) : a, b ∈ R}, i.e. W is the xy plane consisting of those vectors whose third component is 0;
(ii) W = {(a, b, c) : a + b + c = 0}, i.e. W consists of those vectors each with the property that the sum of its components is zero.

(i) 0 = (0, 0, 0) ∈ W since the third component of 0 is 0. For any vectors v = (a, b, 0), w = (c, d, 0) in W, and any scalars (real numbers) k and k',

kv + k'w = k(a, b, 0) + k'(c, d, 0) = (ka, kb, 0) + (k'c, k'd, 0) = (ka + k'c, kb + k'd, 0)

Thus kv + k'w ∈ W, and so W is a subspace of V.

(ii) 0 = (0, 0, 0) ∈ W since 0 + 0 + 0 = 0. Suppose v = (a, b, c), w = (a', b', c') belong to W, i.e. a + b + c = 0 and a' + b' + c' = 0. Then for any scalars k and k',

kv + k'w = k(a, b, c) + k'(a', b', c') = (ka, kb, kc) + (k'a', k'b', k'c') = (ka + k'a', kb + k'b', kc + k'c')

and furthermore,

(ka + k'a') + (kb + k'b') + (kc + k'c') = k(a + b + c) + k'(a' + b' + c') = k0 + k'0 = 0

Thus kv + k'w ∈ W, and so W is a subspace of V.

4.10. Let V = R^3. Show that W is not a subspace of V where:

(i) W = {(a, b, c) : a ≥ 0}, i.e. W consists of those vectors whose first component is nonnegative;
(ii) W = {(a, b, c) : a^2 + b^2 + c^2 ≤ 1}, i.e. W consists of those vectors whose length does not exceed 1;
(iii) W = {(a, b, c) : a, b, c ∈ Q}, i.e. W consists of those vectors whose components are rational numbers.

In each case, show that one of the properties of, say, Theorem 4.2 does not hold.

(i) v = (1, 2, 3) ∈ W and k = -5 ∈ R. But kv = -5(1, 2, 3) = (-5, -10, -15) does not belong to W since -5 is negative. Hence W is not a subspace of V.

(ii) v = (1, 0, 0) ∈ W and w = (0, 1, 0) ∈ W.
But v + w = (1, 0, 0) + (0, 1, 0) = (1, 1, 0) does not belong to W since 1^2 + 1^2 + 0^2 = 2 > 1. Hence W is not a subspace of V.

(iii) v = (1, 2, 3) ∈ W and k = √2 ∈ R. But kv = √2 (1, 2, 3) = (√2, 2√2, 3√2) does not belong to W since its components are not rational numbers. Hence W is not a subspace of V.

4.11. Let V be the vector space of all square n × n matrices over a field K. Show that W is a subspace of V where:

(i) W consists of the symmetric matrices, i.e. all matrices A = (aij) for which aij = aji;
(ii) W consists of all matrices which commute with a given matrix T; that is, W = {A ∈ V : AT = TA}.

(i) 0 ∈ W since all entries of 0 are 0 and hence equal. Now suppose A = (aij) and B = (bij) belong to W, i.e. aij = aji and bij = bji. For any scalars a, b ∈ K, aA + bB is the matrix whose ij-entry is a aij + b bij. But a aji + b bji = a aij + b bij. Thus aA + bB is also symmetric, and so W is a subspace of V.

(ii) 0 ∈ W since 0T = 0 = T0. Now suppose A, B ∈ W; that is, AT = TA and BT = TB. For any scalars a, b ∈ K,

(aA + bB)T = (aA)T + (bB)T = a(AT) + b(BT) = a(TA) + b(TB) = T(aA) + T(bB) = T(aA + bB)

Thus aA + bB commutes with T, i.e. belongs to W; hence W is a subspace of V.

4.12. Let V be the vector space of all 2 × 2 matrices over the real field R. Show that W is not a subspace of V where:

(i) W consists of all matrices with zero determinant;
(ii) W consists of all matrices A for which A^2 = A.

(i) (Recall that det( a  b ; c  d ) = ad - bc.) The matrices A = ( 1  0 ; 0  0 ) and B = ( 0  0 ; 0  1 ) belong to W since det(A) = 0 and det(B) = 0. But A + B = ( 1  0 ; 0  1 ) does not belong to W since det(A + B) = 1. Hence W is not a subspace of V.

(ii) The unit matrix I = ( 1  0 ; 0  1 ) belongs to W since

I^2 = ( 1  0 ; 0  1 )( 1  0 ; 0  1 ) = ( 1  0 ; 0  1 ) = I

But 2I = ( 2  0 ; 0  2 ) does not belong to W since

(2I)^2 = ( 4  0 ; 0  4 ) ≠ 2I

Hence W is not a subspace of V.

4.13. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V where:

(i) W = {f : f(3) = 0}, i.e. W consists of those functions which map 3 into 0;
(ii) W = {f : f(7) = f(1)}, i.e.
W consists of those functions which assign the same value to 7 and 1;
(iii) W consists of the odd functions, i.e. those functions f for which f(-x) = -f(x).

Here 0 denotes the zero function: 0(x) = 0, for every x ∈ R.

(i) 0 ∈ W since 0(3) = 0. Suppose f, g ∈ W, i.e. f(3) = 0 and g(3) = 0. Then for any real numbers a and b,

(af + bg)(3) = af(3) + bg(3) = a·0 + b·0 = 0

Hence af + bg ∈ W, and so W is a subspace of V.

(ii) 0 ∈ W since 0(7) = 0 = 0(1). Suppose f, g ∈ W, i.e. f(7) = f(1) and g(7) = g(1). Then, for any real numbers a and b,

(af + bg)(7) = af(7) + bg(7) = af(1) + bg(1) = (af + bg)(1)

Hence af + bg ∈ W, and so W is a subspace of V.

(iii) 0 ∈ W since 0(-x) = 0 = -0 = -0(x). Suppose f, g ∈ W, i.e. f(-x) = -f(x) and g(-x) = -g(x). Then for any real numbers a and b,

(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af(x) + bg(x)) = -(af + bg)(x)

Hence af + bg ∈ W, and so W is a subspace of V.

4.14. Let V be the vector space of all functions from the real field R into R. Show that W is not a subspace of V where:

(i) W = {f : f(7) = 2 + f(1)};
(ii) W consists of all nonnegative functions, i.e. all functions f for which f(x) ≥ 0, ∀x ∈ R.

(i) Suppose f, g ∈ W, i.e. f(7) = 2 + f(1) and g(7) = 2 + g(1). Then

(f + g)(7) = f(7) + g(7) = 2 + f(1) + 2 + g(1) = 4 + f(1) + g(1) = 4 + (f + g)(1) ≠ 2 + (f + g)(1)

Hence f + g ∉ W, and so W is not a subspace of V.

(ii) Let k = -2 and let f ∈ V be defined by f(x) = x^2. Then f ∈ W since f(x) = x^2 ≥ 0, ∀x ∈ R. But (kf)(5) = kf(5) = (-2)(5^2) = -50 < 0. Hence kf ∉ W, and so W is not a subspace of V.

4.15. Let V be the vector space of polynomials a0 + a1 t + a2 t^2 + ... + an t^n with real coefficients, i.e. ai ∈ R. Determine whether or not W is a subspace of V where:

(i) W consists of all polynomials with integral coefficients;
(ii) W consists of all polynomials with degree ≤ 3;
(iii) W consists of all polynomials b0 + b1 t^2 + b2 t^4 + ... + bn t^(2n), i.e.
polynomials with only even powers of t.

(i) No, since scalar multiples of vectors in W do not always belong to W. For example, v = 3 + 5t + 7t^2 ∈ W but (1/2)v = 3/2 + (5/2)t + (7/2)t^2 ∉ W. (Observe that W is "closed" under vector addition, i.e. sums of elements in W belong to W.)

(ii) and (iii). Yes. For, in each case, W is nonempty, the sum of elements in W belongs to W, and the scalar multiples of any element in W belong to W.

4.16. Consider a homogeneous system of linear equations in n unknowns x1, ..., xn over a field K:

a11 x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + a22 x2 + ... + a2n xn = 0
...............................
am1 x1 + am2 x2 + ... + amn xn = 0

Show that the solution set W is a subspace of the vector space K^n.

0 = (0, 0, ..., 0) ∈ W since, clearly,

ai1·0 + ai2·0 + ... + ain·0 = 0,   for i = 1, ..., m
Thus we require (1, -2, 5) = x{l, 1, 1) + j/(l, 2, 3) + z(2, -1, 1) = (x, X, x) + (y, 2y, 3y) + (22, -z, z) = (a; + 2/ + 2z, a; + 2j/ — z, a; + 32/ + 2) Form the equivalent system of equations by setting corresponding components equal to each other, and then reduce to echelon form: x+ y + 2z = 1 x + y + 2z = 1 x + y + 2z = 1 X + 2y — z = —2 or j/ - 3z = -3 or y - Sz = S x + 3y + z = 5 2y - z = 4, 5z=10 Note that the above system is consistent and so has a solution. Solve for the unknowns to obtain X = —6, y = B, z — 2. Hence v = — 6ei + 862 + 263. 4.18. Write the vector v = (2, -5, 3) in R^ as a linear combination of the vectors Ci = (1,-3,2), 62 = (2, -4,-1) and 63 = (1,-5, 7). Set V as a linear combination of the Cj using the unknowns x, y and z: v = xe^ + j/eg + zeg. (2, -5, 3) = x{\, -3, 2) + y{2, -4, -1) + z(l, -5, 7) = {x + 2y + z, -3x -4y-5z,2x-y + 7z) Form the equivalent system of equations and reduce to echelon form: x + 2y+z=2 x + 2y+z-2 x + 2y + z = 2 -3x - 4y - 5z = -5 or 2y - 2z = 1 or 2^ - 2z = 1 2x — y + Tz = 3 -5y + 5z = -1 = 3 The system is inconsistent and so has no solution. Accordingly, v cannot be written as a linear com- bination of the vectors Ci, e^ and 63. 4.19. For which value of k will the vector u = (1, -2, k) in R" be a linear combination of the vectors v = (3, 0, -2) and w = (2, -1, -5) ? Set u = XV + yw: (1, -2, fe) = a(3, 0, -2) + j/(2, -1, -5) = (3a; + 2y, -y, -2x - 5y) Form the equivalent system of equations: 3x + 2y = 1, -y = -2, -2x - 5y = k By the first two equations, x = —1, j/ = 2. Substitute into the last equation to obtain k = —8. 76 VECTOR SPACES AND SUBSPACES [CHAP. 4 4^0. Write the polynomial v = t^ + 4t — 3 over R as a linear combination of the poly- nomials ei = t^-2t + 5, 62 = 2t^ - St and ca = t + S. Set D as a linear combination of the ej using the unknowns x, y and z: v = xe^ + ye^ + 263. 
t2 + 4t - 3 = a;(t2-2t + 5) + 3/(2«2-3t) + 2(f + 3) = a;t2 - 2xt + 5a; + 22/t2 - s^/t + zt + 3z - {x + 2y)fi + {-2x-3y + z)t + (5a; + 3z) Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form: x + 2y = 1 X + 2y = 1 x + 2y = 1 —2x — 3j/+z=4 or 2/+z=6 or 2/ + z=6 hx + 3z = -3 -IQy + 3z = -8 13z = 52 Note that the system is consistent and so has a solution. Solve for the unknowns to obtain a; = -3, 2/ = 2, z = 4. Thus v - -Ssj + ie^ + 463. 4.21. Write the matrix E — I j as a linear combination of the matrices A = ;j).-a?)--(:-x Set £ as a linear combination of A,B,C using the unknowns x,y,z: E — xA + yB + zC. 3 IN /I 1\ /O 0\ /O 2 1 _iy = ^'(i oy + \i 1) + ^o -1 X iKN,/0 0N/0 2z\ / X X + 2z /O 22 \0 -s X 0/ \y VJ \0-«/ \a; + 2/ y — z Form the equivalent system of equations by setting corresponding entries equal to each other: a; = 3, X + y — 1, x + 2z = 1, y — z = —1 Substitute a; = 3 in the second and third equations to obtain y = —2 and z = —1. Since these values also satisfy the last equation, they form a solution of the system. Hence E = BA — 2B — C. 4.22. Suppose m is a linear combination of the vectors Vi, . . .,Vm and suppose each Vi is a linear combination of the vectors Wi, . . . , Wn- u = aiVi + a2V2 + • • • + OmVm and Vi = haWi + baWi + ■ • • + bi„w„ Show that u is also a linear combination of the wu Thus if ScL{T), then L{S) (ZL{T). u = a^Vi + a^v^ + • • • + a^v^ = ai(6iiWi + • • • + 6i„W„) + 02(62l"'l + • ■ • + b2nWn) + • • ' + a^(h.ml1»l + • • • + frmn'"'™) = (diftji + (12621 + • • ■ + am6mi)wi + • • • + (ai6i„ + a^h^n + • • • + aJi^^Wn m m / n \ n / 7n \ or simply u - "S, ^in = 2 ^i ( 2 h'^j ) = 2 ( 2 cuba ) Wj i=l i=l \3=1 / 3=1 \i=l / LINEAR SPANS, GENERATORS 4.23. Show that the vectors u = (1, 2, 3), v = (0, 1, 2) and w = (0, 0, 1) generate W. We need to show that an arbitrary vector (a, 6, c) S R3 is a linear combination of u, v and w. 
Set (a, b, c) = xu + yv + zw:

(a, b, c) = x(1, 2, 3) + y(0, 1, 2) + z(0, 0, 1) = (x, 2x + y, 3x + 2y + z)

Then form the system of equations

x            = a
2x + y       = b
3x + 2y + z  = c

The above system is in echelon form and is consistent; in fact x = a, y = b - 2a, z = c - 2b + a is a solution. Thus u, v and w generate R^3.

4.24. Find conditions on a, b and c so that (a, b, c) ∈ R^3 belongs to the space generated by u = (2, 1, 0), v = (1, -1, 2) and w = (0, 3, -4).

Set (a, b, c) as a linear combination of u, v and w using unknowns x, y and z: (a, b, c) = xu + yv + zw.

(a, b, c) = x(2, 1, 0) + y(1, -1, 2) + z(0, 3, -4) = (2x + y, x - y + 3z, 2y - 4z)

Form the equivalent system of linear equations and reduce it to echelon form:

2x +  y      = a          2x + y      = a              2x + y      = a
 x -  y + 3z = b    or        3y - 6z = a - 2b    or       3y - 6z = a - 2b
      2y - 4z = c             2y - 4z = c                        0 = 2a - 4b - 3c

The vector (a, b, c) belongs to the space generated by u, v and w if and only if the above system is consistent, and it is consistent if and only if 2a - 4b - 3c = 0. Note, in particular, that u, v and w do not generate the whole space R^3.

4.25. Show that the xy plane W = {(a, b, 0)} in R^3 is generated by u and v where: (i) u = (1, 2, 0) and v = (0, 1, 0); (ii) u = (2, -1, 0) and v = (1, 3, 0).

In each case show that an arbitrary vector (a, b, 0) ∈ W is a linear combination of u and v.

(i) Set (a, b, 0) = xu + yv:

(a, b, 0) = x(1, 2, 0) + y(0, 1, 0) = (x, 2x + y, 0)

Then form the system of equations

x = a,   2x + y = b,   0 = 0

The system is consistent; in fact x = a, y = b - 2a is a solution. Hence u and v generate W.

(ii) Set (a, b, 0) = xu + yv:

(a, b, 0) = x(2, -1, 0) + y(1, 3, 0) = (2x + y, -x + 3y, 0)

Form the following system and reduce it to echelon form:

2x +  y = a          2x + y = a
-x + 3y = b    or        7y = a + 2b

The system is consistent and so has a solution. Hence W is generated by u and v.
(Observe that we do not need to solve for x and y; it is only necessary to know that a solution exists.)

4.26. Show that the vector space V of polynomials over any field K cannot be generated by a finite number of vectors.

Any finite set S of polynomials contains one of maximum degree, say m. Then the linear span L(S) of S cannot contain polynomials of degree greater than m. Accordingly, V ≠ L(S), for any finite set S.

4.27. Prove Theorem 4.5: Let S be a nonempty subset of V. Then L(S), the set of all linear combinations of vectors in S, is a subspace of V containing S. Furthermore, if W is any other subspace of V containing S, then L(S) ⊆ W.

If v ∈ S, then 1v = v ∈ L(S); hence S is a subset of L(S). Also, L(S) is nonempty since S is nonempty. Now suppose v, w ∈ L(S); say

    v = a1 v1 + ... + am vm   and   w = b1 w1 + ... + bn wn

where vi, wj ∈ S and ai, bj are scalars. Then

    v + w = a1 v1 + ... + am vm + b1 w1 + ... + bn wn

and, for any scalar k,

    kv = k(a1 v1 + ... + am vm) = ka1 v1 + ... + kam vm

belong to L(S) since each is a linear combination of vectors in S. Accordingly, L(S) is a subspace of V.

Now suppose W is a subspace of V containing S, and suppose v1, ..., vm ∈ S ⊆ W. Then all multiples a1 v1, ..., am vm ∈ W, where ai ∈ K, and hence the sum a1 v1 + ... + am vm ∈ W. That is, W contains all linear combinations of elements of S. Consequently, L(S) ⊆ W, as claimed.

ROW SPACE OF A MATRIX

4.28. Determine whether the following matrices have the same row space:

    A = (1 1  5)     B = (1 -1 -2)     C = (1 -1 -1)
        (2 3 13)         (3 -2 -3)         (4 -3 -1)
                                           (3 -1  3)

Row reduce each matrix to row canonical form:

    A  to  (1 1 5)    to  (1 0 2)
           (0 1 3)        (0 1 3)

    B  to  (1 -1 -2)  to  (1 0 1)
           (0  1  3)      (0 1 3)

    C  to  (1 -1 -1)  to  (1 -1 -1)  to  (1 0 2)
           (0  1  3)      (0  1  3)      (0 1 3)
           (0  2  6)      (0  0  0)      (0 0 0)

Since the nonzero rows of the reduced form of A and of the reduced form of C are the same, A and C have the same row space. On the other hand, the nonzero rows of the reduced form of B are not the same as the others, and so B has a different row space.

4.29.
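The comparison in Problem 4.28 can be automated: reduce each matrix to row canonical form and compare the nonzero rows (Theorem 4.7, proved below, justifies this test). The following sketch uses exact rational arithmetic; the helper names `rref` and `same_row_space` are my own, and the matrices in the test are small illustrative ones rather than those of the problem:

```python
from fractions import Fraction

def rref(matrix):
    """Row reduce a matrix to row canonical form, using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        # find a pivot in column c at or below row r
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        p = m[r][c]
        m[r] = [x / p for x in m[r]]          # scale pivot row to leading 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:       # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def same_row_space(A, B):
    """Theorem 4.7: equal row spaces iff identical nonzero rows in reduced form."""
    nonzero = lambda M: [row for row in rref(M) if any(row)]
    return nonzero(A) == nonzero(B)
```

For example, `same_row_space([[1, 2], [3, 4]], [[1, 0], [0, 1]])` is true (both row spaces are all of R^2), while `same_row_space([[1, 2], [3, 4]], [[1, 2], [2, 4]])` is false.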
Consider an arbitrary matrix A = (aij). Suppose u = (b1, ..., bn) is a linear combination of the rows R1, ..., Rm of A; say u = k1 R1 + ... + km Rm. Show that, for each i,

    bi = k1 a_{1i} + k2 a_{2i} + ... + km a_{mi}

where a_{1i}, ..., a_{mi} are the entries of the ith column of A.

We are given u = k1 R1 + ... + km Rm; hence

    (b1, ..., bn) = k1 (a_{11}, ..., a_{1n}) + ... + km (a_{m1}, ..., a_{mn})
                  = (k1 a_{11} + ... + km a_{m1}, ..., k1 a_{1n} + ... + km a_{mn})

Setting corresponding components equal to each other, we obtain the desired result.

4.30. Prove: Let A = (aij) be an echelon matrix with distinguished entries a_{1j_1}, a_{2j_2}, ..., a_{rj_r}, and let B = (bij) be an echelon matrix with distinguished entries b_{1k_1}, b_{2k_2}, ..., b_{sk_s}. (Here A and B are pictured schematically with each distinguished entry as the first nonzero entry of its row, and zeros below and to the left.) Suppose A and B have the same row space. Then the distinguished entries of A and of B are in the same positions: j1 = k1, j2 = k2, ..., jr = kr, and r = s.

Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when r ≥ 1 and s ≥ 1. We first show that j1 = k1. Suppose j1 < k1. Then the j1th column of B is zero. Since the first row of A is in the row space of B, we have by the preceding problem

    a_{1j_1} = c1·0 + c2·0 + ... + cs·0 = 0

for scalars ci. But this contradicts the fact that the distinguished entry a_{1j_1} ≠ 0. Hence j1 ≥ k1, and similarly k1 ≥ j1. Thus j1 = k1.

Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the submatrix of B obtained by deleting the first row of B. We prove that A' and B' have the same row space. The theorem will then follow by induction since A' and B' are also echelon matrices.

Let R = (a1, a2, ..., an) be any row of A' and let R1, ..., Rs be the rows of B. Since R is in the row space of B, there exist scalars d1,
..., ds such that

    R = d1 R1 + d2 R2 + ... + ds Rs

Since A is in echelon form and R is not the first row of A, the k1th entry of R is zero (recall j1 = k1). Furthermore, since B is in echelon form, all the entries in the k1th column of B are 0 except the first: b_{1k_1} ≠ 0, but b_{2k_1} = 0, ..., b_{sk_1} = 0. Thus

    0 = a_{k_1} = d1 b_{1k_1} + d2·0 + ... + ds·0 = d1 b_{1k_1}

Now b_{1k_1} ≠ 0 and so d1 = 0. Thus R is a linear combination of R2, ..., Rs and so is in the row space of B'. Since R was any row of A', the row space of A' is contained in the row space of B'. Similarly, the row space of B' is contained in the row space of A'. Thus A' and B' have the same row space, and so the theorem is proved.

4.31. Prove Theorem 4.7: Let A = (aij) and B = (bij) be row reduced echelon matrices. Then A and B have the same row space if and only if they have the same nonzero rows.

Obviously, if A and B have the same nonzero rows then they have the same row space. Thus we only have to prove the converse.

Suppose A and B have the same row space, and suppose R ≠ 0 is the ith row of A. Then there exist scalars c1, ..., cs such that

    R = c1 R1 + c2 R2 + ... + cs Rs     (1)

where the Ri are the nonzero rows of B. The theorem is proved if we show that R = Ri, i.e. that ci = 1 but ck = 0 for k ≠ i.

Let a_{ij_i} be the distinguished entry in R, i.e. the first nonzero entry of R. By (1) and Problem 4.29,

    a_{ij_i} = c1 b_{1j_i} + c2 b_{2j_i} + ... + cs b_{sj_i}     (2)

But by the preceding problem, b_{ij_i} is a distinguished entry of B and, since B is row reduced, it is the only nonzero entry in the jith column of B. Thus from (2) we obtain a_{ij_i} = ci b_{ij_i}. However, a_{ij_i} = 1 and b_{ij_i} = 1 since A and B are row reduced; hence ci = 1.

Now suppose k ≠ i, and b_{kj_k} is the distinguished entry in Rk. By (1) and Problem 4.29,

    a_{ij_k} = c1 b_{1j_k} + c2 b_{2j_k} + ... + ck b_{kj_k} + ... + cs b_{sj_k}     (3)
Since B is row reduced, b_{kj_k} is the only nonzero entry in the jkth column of B; hence by (3), a_{ij_k} = ck b_{kj_k}. Furthermore, by the preceding problem a_{kj_k} is a distinguished entry of A and, since A is row reduced, a_{ij_k} = 0. Thus ck b_{kj_k} = 0 and, since b_{kj_k} = 1, ck = 0. Accordingly R = Ri, and the theorem is proved.

4.32. Determine whether the following matrices have the same column space:

    A = (1 3 5)      B = ( 1   2   3)
        (1 4 3)          (-2  -3  -4)
        (1 1 9)          ( 7  12  17)

Observe that A and B have the same column space if and only if the transposes A^t and B^t have the same row space. Thus reduce A^t and B^t to row reduced echelon form:

    A^t = (1 1 1)  to  (1  1  1)  to  (1  1  1)  to  (1 0  3)
          (3 4 1)      (0  1 -2)      (0  1 -2)      (0 1 -2)
          (5 3 9)      (0 -2  4)      (0  0  0)      (0 0  0)

    B^t = (1 -2  7)  to  (1 -2  7)  to  (1 -2  7)  to  (1 0  3)
          (2 -3 12)      (0  1 -2)      (0  1 -2)      (0 1 -2)
          (3 -4 17)      (0  2 -4)      (0  0  0)      (0 0  0)

Since A^t and B^t have the same row space, A and B have the same column space.

4.33. Let R be a row vector and B a matrix for which RB is defined. Show that RB is a linear combination of the rows of B. Furthermore, if A is a matrix for which AB is defined, show that the row space of AB is contained in the row space of B.

Suppose R = (a1, a2, ..., am) and B = (bij). Let B1, ..., Bm denote the rows of B and B^1, ..., B^n its columns. Then

    RB = (R·B^1, R·B^2, ..., R·B^n)
       = (a1 b_{11} + a2 b_{21} + ... + am b_{m1}, a1 b_{12} + a2 b_{22} + ... + am b_{m2}, ..., a1 b_{1n} + a2 b_{2n} + ... + am b_{mn})
       = a1 (b_{11}, b_{12}, ..., b_{1n}) + a2 (b_{21}, b_{22}, ..., b_{2n}) + ... + am (b_{m1}, b_{m2}, ..., b_{mn})
       = a1 B1 + a2 B2 + ... + am Bm

Thus RB is a linear combination of the rows of B, as claimed.

By Problem 3.27, the rows of AB are Ri B where Ri is the ith row of A. Hence by the above result each row of AB is in the row space of B. Thus the row space of AB is contained in the row space of B.

SUMS AND DIRECT SUMS

4.34. Let U and W be subspaces of a vector space V. Show that: (i) U and W are contained in U + W; (ii) U + W is the smallest subspace of V containing U and W, that is, U + W is the linear span of U and W: U + W = L(U, W).

(i) Let u ∈ U. By hypothesis W is a subspace of V and so 0 ∈ W.
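The identity RB = a1 B1 + ... + am Bm proved in Problem 4.33 can be checked numerically. A minimal sketch (the helper names are my own, and the sample R and B are illustrative):

```python
def vec_times_matrix(R, B):
    """Compute the row vector RB directly, entry by entry."""
    return [sum(R[i] * B[i][j] for i in range(len(R))) for j in range(len(B[0]))]

def row_combination(R, B):
    """Compute a1*B1 + ... + am*Bm, the same combination of the rows of B."""
    out = [0] * len(B[0])
    for coeff, row in zip(R, B):
        out = [o + coeff * x for o, x in zip(out, row)]
    return out

R = [1, -2, 3]
B = [[1, 0, 2],
     [4, 1, 0],
     [0, 5, 1]]

print(vec_times_matrix(R, B))   # -> [-7, 13, 5]
print(row_combination(R, B))    # -> [-7, 13, 5], the same vector
```

Since every row of AB has this form, each lies in the row space of B, which is the containment claimed in the problem.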
Hence u = u + 0 ∈ U + W. Accordingly, U is contained in U + W. Similarly, W is contained in U + W.

(ii) Since U + W is a subspace of V (Theorem 4.8) containing both U and W, it must also contain the linear span of U and W: L(U, W) ⊆ U + W. On the other hand, if v ∈ U + W then v = u + w = 1u + 1w where u ∈ U and w ∈ W; hence v is a linear combination of elements in U ∪ W and so belongs to L(U, W). Thus U + W ⊆ L(U, W).

The two inclusion relations give us the required result.

4.35. Suppose U and W are subspaces of a vector space V, and that {ui} generates U and {wj} generates W. Show that {ui, wj}, i.e. {ui} ∪ {wj}, generates U + W.

Let v ∈ U + W. Then v = u + w where u ∈ U and w ∈ W. Since {ui} generates U, u is a linear combination of the ui; and since {wj} generates W, w is a linear combination of the wj:

    u = a1 u_{i_1} + a2 u_{i_2} + ... + ar u_{i_r},   ai ∈ K
    w = b1 w_{j_1} + b2 w_{j_2} + ... + bs w_{j_s},   bj ∈ K

Thus

    v = u + w = a1 u_{i_1} + ... + ar u_{i_r} + b1 w_{j_1} + ... + bs w_{j_s}

and so {ui, wj} generates U + W.

4.36. Prove Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only if (i) V = U + W and (ii) U ∩ W = {0}.

Suppose V = U ⊕ W. Then any v ∈ V can be uniquely written in the form v = u + w where u ∈ U and w ∈ W. Thus, in particular, V = U + W. Now suppose v ∈ U ∩ W. Then:

    (1) v = v + 0 where v ∈ U, 0 ∈ W;   and   (2) v = 0 + v where 0 ∈ U, v ∈ W

Since such a sum for v must be unique, v = 0. Accordingly, U ∩ W = {0}.

On the other hand, suppose V = U + W and U ∩ W = {0}. Let v ∈ V. Since V = U + W, there exist u ∈ U and w ∈ W such that v = u + w. We need to show that such a sum is unique. Suppose also that v = u' + w' where u' ∈ U and w' ∈ W. Then

    u + w = u' + w'   and so   u - u' = w' - w

But u - u' ∈ U and w' - w ∈ W; hence by U ∩ W = {0},

    u - u' = 0,   w' - w = 0   and so   u = u',   w = w'

Thus such a sum for v ∈ V is unique and V = U ⊕ W.

4.37.
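The decomposition used in Problem 4.37 below is completely explicit: any v = (a, b, c) splits as (a, a, a) + (0, b - a, c - a). A minimal sketch (the function name `decompose` is my own):

```python
def decompose(v):
    """Split v in R^3 as u + w, with u in U = {(a, a, a)} and w in W = {(0, b, c)}."""
    a, b, c = v
    u = (a, a, a)            # all components equal, so u is in U
    w = (0, b - a, c - a)    # first component 0, so w is in W
    return u, w

u, w = decompose((5, 7, -2))
print(u, w)  # -> (5, 5, 5) (0, 2, -7)
```

Uniqueness of the split is exactly the condition U ∩ W = {0} from Theorem 4.9: a vector in both subspaces has a = b = c and a = 0, hence is zero.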
Let U and W be the subspaces of R^3 defined by

    U = {(a, b, c) : a = b = c}   and   W = {(0, b, c)}

(Note that W is the yz plane.) Show that R^3 = U ⊕ W.

Note first that U ∩ W = {0}, for v = (a, b, c) ∈ U ∩ W implies that

    a = b = c and a = 0,   which implies   a = 0, b = 0, c = 0,   i.e. v = (0, 0, 0)

We also claim that R^3 = U + W. For if v = (a, b, c) ∈ R^3, then

    v = (a, a, a) + (0, b - a, c - a)   where (a, a, a) ∈ U and (0, b - a, c - a) ∈ W

Both conditions, U ∩ W = {0} and R^3 = U + W, imply R^3 = U ⊕ W.

4.38. Let V be the vector space of n-square matrices over the real field R. Let U and W be the subspaces of symmetric and antisymmetric matrices, respectively. Show that V = U ⊕ W. (The matrix M is symmetric iff M = M^t, and antisymmetric iff M^t = -M.)

We first show that V = U + W. Let A be any arbitrary n-square matrix. Note that

    A = ½(A + A^t) + ½(A - A^t)

We claim that ½(A + A^t) ∈ U and that ½(A - A^t) ∈ W. For

    (½(A + A^t))^t = ½(A + A^t)^t = ½(A^t + A) = ½(A + A^t)

that is, ½(A + A^t) is symmetric. Furthermore,

    (½(A - A^t))^t = ½(A - A^t)^t = ½(A^t - A) = -½(A - A^t)

that is, ½(A - A^t) is antisymmetric.

We next show that U ∩ W = {0}. Suppose M ∈ U ∩ W. Then M = M^t and M^t = -M, which implies M = -M and so M = 0. Hence U ∩ W = {0}. Accordingly, V = U ⊕ W.

Supplementary Problems

VECTOR SPACES

4.39. Let V be the set of infinite sequences (a1, a2, ...) in a field K with addition in V and scalar multiplication on V defined by

    (a1, a2, ...) + (b1, b2, ...) = (a1 + b1, a2 + b2, ...)
    k(a1, a2, ...) = (ka1, ka2, ...)

where ai, bi, k ∈ K. Show that V is a vector space over K.

4.40. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication on V defined by

    (a, b) + (c, d) = (a + c, b + d)   and   k(a, b) = (ka, 0)

Show that V satisfies all of the axioms of a vector space except [M4]: 1u = u. Hence [M4] is not a consequence of the other axioms.

4.41. Let V be the set of ordered pairs (a, b) of real numbers.
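The symmetric/antisymmetric split of Problem 4.38 is again explicit: S = (A + A^t)/2 and T = (A - A^t)/2. A minimal sketch with exact arithmetic (the helper name `sym_antisym` is my own):

```python
from fractions import Fraction

def sym_antisym(A):
    """Return (S, T) with S = (A + A^t)/2 symmetric, T = (A - A^t)/2
    antisymmetric, and S + T = A."""
    n = len(A)
    half = Fraction(1, 2)
    S = [[half * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    T = [[half * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return S, T

S, T = sym_antisym([[1, 2], [3, 4]])
print(S)  # -> [[1, 5/2], [5/2, 4]]
print(T)  # -> [[0, -1/2], [1/2, 0]]
```

Note the division by 2: the argument works over any field in which 1 + 1 ≠ 0, which is why the problem is stated over R.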
Show that V is not a vector space over R with addition in V and scalar multiplication on V defined by:

    (i)   (a, b) + (c, d) = (a + d, b + c)   and   k(a, b) = (ka, kb);
    (ii)  (a, b) + (c, d) = (a + c, b + d)   and   k(a, b) = (a, b);
    (iii) (a, b) + (c, d) = (0, 0)           and   k(a, b) = (ka, kb);
    (iv)  (a, b) + (c, d) = (ac, bd)         and   k(a, b) = (ka, kb).

4.42. Let V be the set of ordered pairs (z1, z2) of complex numbers. Show that V is a vector space over the real field R with addition in V and scalar multiplication on V defined by

    (z1, z2) + (w1, w2) = (z1 + w1, z2 + w2)   and   k(z1, z2) = (kz1, kz2)

where z1, z2, w1, w2 ∈ C and k ∈ R.

4.43. Let V be a vector space over K, and let F be a subfield of K. Show that V is also a vector space over F, where vector addition with respect to F is the same as that with respect to K, and where scalar multiplication by an element k ∈ F is the same as multiplication by k as an element of K.

4.44. Show that [A4], page 63, can be derived from the other axioms of a vector space.

4.45. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u, w) where u belongs to U and w to W: V = {(u, w) : u ∈ U, w ∈ W}. Show that V is a vector space over K with addition in V and scalar multiplication on V defined by

    (u, w) + (u', w') = (u + u', w + w')   and   k(u, w) = (ku, kw)

where u, u' ∈ U, w, w' ∈ W and k ∈ K. (This space V is called the external direct sum of U and W.)

SUBSPACES

4.46. Consider the vector space V in Problem 4.39, of infinite sequences (a1, a2, ...) in a field K. Show that W is a subspace of V if: (i) W consists of all sequences with 0 as the first component; (ii) W consists of all sequences with only a finite number of nonzero components.

4.47. Determine whether or not W is a subspace of R^3 if W consists of those vectors (a, b, c) ∈ R^3 for which: (i) a = 2b; (ii) a ≤ b ≤ c; (iii) ab = 0; (iv) a = b - c; (v) a = b^2; (vi) k1 a + k2 b + k3 c = 0, where ki ∈ R.

4.48. Let V be the vector space of n-square matrices over a field K.
Show that W is a subspace of V if W consists of all matrices which are (i) antisymmetric (A^t = -A), (ii) (upper) triangular, (iii) diagonal, (iv) scalar.

4.49. Let AX = B be a nonhomogeneous system of linear equations in n unknowns over a field K. Show that the solution set of the system is not a subspace of K^n.

4.50. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V in each of the following cases.

    (i)   W consists of all bounded functions. (Here f : R → R is bounded if there exists M ∈ R such that |f(x)| ≤ M, for all x ∈ R.)
    (ii)  W consists of all even functions. (Here f : R → R is even if f(-x) = f(x), for all x ∈ R.)
    (iii) W consists of all continuous functions.
    (iv)  W consists of all differentiable functions.
    (v)   W consists of all integrable functions in, say, the interval 0 ≤ x ≤ 1.

(The last three cases require some knowledge of analysis.)

4.51. Discuss whether or not R^2 is a subspace of R^3.

4.52. Prove Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace of V.

4.53. Suppose U and W are subspaces of V for which U ∪ W is also a subspace. Show that either U ⊆ W or W ⊆ U.

LINEAR COMBINATIONS

4.54. Consider the vectors u = (1, -3, 2) and v = (2, -1, 1) in R^3.
    (i)   Write (1, 7, -4) as a linear combination of u and v.
    (ii)  Write (2, -5, 4) as a linear combination of u and v.
    (iii) For which value of k is (1, k, 5) a linear combination of u and v?
    (iv)  Find a condition on a, b and c so that (a, b, c) is a linear combination of u and v.

4.55. Write u as a linear combination of the polynomials v = 2t^2 + 3t - 4 and w = t^2 - 2t - 3 where (i) u = 3t^2 + 8t - 5, (ii) u = 4t^2 - 6t - 1.

4.56. Write E as a linear combination of three given 2 × 2 matrices A, B and C where: (i) E = …; (ii) E = ….

LINEAR SPANS, GENERATORS

4.57. Show that (1, 1, 1), (0, 1, 1) and (0, 1, -1) generate R^3, i.e.
that any vector (a, b, c) is a linear combination of the given vectors.

4.58. Show that the yz plane W = {(0, b, c)} in R^3 is generated by: (i) (0, 1, 1) and (0, 2, -1); (ii) (0, 1, 2), (0, 2, 3) and (0, 3, 1).

4.59. Show that the complex numbers w = 2 + 3i and z = 1 - 2i generate the complex field C as a vector space over the real field R.

4.60. Show that the polynomials (1 - t)^3, (1 - t)^2, 1 - t and 1 generate the space of polynomials of degree ≤ 3.

4.61. Find one vector in R^3 which generates the intersection of U and W where U is the xy plane: U = {(a, b, 0)}, and W is the space generated by the vectors (1, 2, 3) and (1, -1, 1).

4.62. Prove: L(S) is the intersection of all the subspaces of V containing S.

4.63. Show that L(S) = L(S ∪ {0}). That is, by joining or deleting the zero vector from a set, we do not change the space generated by the set.

4.64. Show that if S ⊆ T, then L(S) ⊆ L(T).

4.65. Show that L(L(S)) = L(S).

ROW SPACE OF A MATRIX

4.66. Determine which of the following matrices have the same row space:

    A = (1 -1 3; …),   B = (…),   C = (…; 3 -4 5)

4.67. Let

    u1 = (1, 1, -1),   u2 = (2, 3, -1),   u3 = (3, 1, -5)
    v1 = (1, -1, -3),  v2 = (3, -2, -8),  v3 = (2, 1, -3)

Show that the subspace of R^3 generated by the ui is the same as the subspace generated by the vi.

4.68. Show that if any row of an echelon (row reduced echelon) matrix is deleted, then the resulting matrix is still in echelon (row reduced echelon) form.

4.69. Prove the converse of Theorem 4.6: Matrices with the same row space (and the same size) are row equivalent.

4.70. Show that A and B have the same column space iff A^t and B^t have the same row space.

4.71. Let A and B be matrices for which AB is defined. Show that the column space of AB is contained in the column space of A.

SUMS AND DIRECT SUMS

4.72. We extend the notion of sum to arbitrary nonempty subsets (not necessarily subspaces) S and T of a vector space V by defining S + T = {s + t : s ∈ S, t ∈ T}.
Show that this operation satisfies:
    (i)   commutative law: S + T = T + S;
    (ii)  associative law: (S1 + S2) + S3 = S1 + (S2 + S3);
    (iii) S + {0} = {0} + S = S;
    (iv)  S + V = V + S = V.

4.73. Show that for any subspace W of a vector space V, W + W = W.

4.74. Give an example of a subset S of a vector space V which is not a subspace of V but for which (i) S + S = S, (ii) S + S ⊂ S (properly contained).

4.75. We extend the notion of sum of subspaces to more than two summands as follows. If W1, W2, ..., Wn are subspaces of V, then

    W1 + W2 + ... + Wn = {w1 + w2 + ... + wn : wi ∈ Wi}

Show that: (i) L(W1, W2, ..., Wn) = W1 + W2 + ... + Wn; (ii) if Si generates Wi, i = 1, ..., n, then S1 ∪ S2 ∪ ... ∪ Sn generates W1 + W2 + ... + Wn.

4.76. Suppose U, V and W are subspaces of a vector space. Prove that

    (U ∩ V) + (U ∩ W) ⊆ U ∩ (V + W)

Find subspaces of R^3 for which equality does not hold.

4.77. Let U, V and W be the following subspaces of R^3:

    U = {(a, b, c) : a + b + c = 0},   V = {(a, b, c) : a = c},   W = {(0, 0, c) : c ∈ R}

Show that (i) R^3 = U + V, (ii) R^3 = U + W, (iii) R^3 = V + W. When is the sum direct?

4.78. Let V be the vector space of all functions from the real field R into R. Let U be the subspace of even functions and W the subspace of odd functions. Show that V = U ⊕ W. (Recall that f is even iff f(-x) = f(x), and f is odd iff f(-x) = -f(x).)

4.79. Let W1, W2, ... be subspaces of a vector space V for which W1 ⊆ W2 ⊆ ···. Let W = W1 ∪ W2 ∪ ···. Show that W is a subspace of V.

4.80. In the preceding problem, suppose Si generates Wi, i = 1, 2, .... Show that S = S1 ∪ S2 ∪ ··· generates W.

4.81. Let V be the vector space of n-square matrices over a field K. Let U be the subspace of upper triangular matrices and W the subspace of lower triangular matrices. Find (i) U + W, (ii) U ∩ W.

4.82. Let V be the external direct sum of the vector spaces U and W over a field K. (See Problem 4.45.) Let
    Û = {(u, 0) : u ∈ U},   Ŵ = {(0, w) : w ∈ W}

Show that (i) Û and Ŵ are subspaces of V, (ii) V = Û ⊕ Ŵ.

Answers to Supplementary Problems

4.47. (i) Yes. (ii) No; e.g. (1, 2, 3) ∈ W but -2(1, 2, 3) ∉ W. (iii) No; e.g. (1, 0, 0), (0, 1, 0) ∈ W, but not their sum. (iv) Yes. (v) No; e.g. (9, 3, 0) ∈ W but 2(9, 3, 0) ∉ W. (vi) Yes.

4.50. (i) Let f, g ∈ W with Mf and Mg bounds for f and g respectively. Then for any scalars a, b ∈ R,

    |(af + bg)(x)| = |af(x) + bg(x)| ≤ |af(x)| + |bg(x)| = |a||f(x)| + |b||g(x)| ≤ |a|Mf + |b|Mg

That is, |a|Mf + |b|Mg is a bound for the function af + bg.

(ii) (af + bg)(-x) = af(-x) + bg(-x) = af(x) + bg(x) = (af + bg)(x)

4.51. No. Although one may "identify" the vector (a, b) ∈ R^2 with, say, (a, b, 0) in the xy plane in R^3, they are distinct elements belonging to distinct, disjoint sets.

4.54. (i) -3u + 2v. (ii) Impossible. (iii) k = -8. (iv) a - 3b - 5c = 0.

4.55. (i) u = 2v - w. (ii) Impossible.

4.56. (i) E = 2A - B + 2C. (ii) Impossible.

4.61. (2, -5, 0).

4.66. A and C.

4.67. Form the matrix A whose rows are the ui and the matrix B whose rows are the vi, and then show that A and B have the same row canonical forms.

4.74. (i) In R^2, let S = {(0, 0), (0, 1), (0, 2), (0, 3), ...}. (ii) In R^2, let S = {(0, 5), (0, 6), (0, 7), ...}.

4.77. The sum is direct in (ii) and (iii).

4.78. Hint. f(x) = ½(f(x) + f(-x)) + ½(f(x) - f(-x)), where ½(f(x) + f(-x)) is even and ½(f(x) - f(-x)) is odd.

4.81. (i) V = U + W. (ii) U ∩ W is the space of diagonal matrices.

Chapter 5

Basis and Dimension

INTRODUCTION

Some of the fundamental results proven in this chapter are:

    (i)   The "dimension" of a vector space is well defined (Theorem 5.3).
    (ii)  If V has dimension n over K, then V is "isomorphic" to K^n (Theorem 5.12).
    (iii) A system of linear equations has a solution if and only if the coefficient and augmented matrices have the same "rank" (Theorem 5.10).

These concepts and results are nontrivial and answer certain questions raised and investigated by mathematicians of yesterday.
We will begin the chapter with the definition of linear dependence and independence. This concept plays an essential role in the theory of linear algebra and in mathematics in general.

LINEAR DEPENDENCE

Definition: Let V be a vector space over a field K. The vectors v1, ..., vm ∈ V are said to be linearly dependent over K, or simply dependent, if there exist scalars a1, ..., am ∈ K, not all of them 0, such that

    a1 v1 + a2 v2 + ... + am vm = 0     (*)

Otherwise, the vectors are said to be linearly independent over K, or simply independent.

Observe that the relation (*) will always hold if the a's are all 0. If this relation holds only in this case, that is,

    a1 v1 + a2 v2 + ... + am vm = 0   only if   a1 = 0, ..., am = 0

then the vectors are linearly independent. On the other hand, if the relation (*) also holds when one of the a's is not 0, then the vectors are linearly dependent.

Observe that if 0 is one of the vectors v1, ..., vm, say v1 = 0, then the vectors must be dependent; for

    1·v1 + 0·v2 + ... + 0·vm = 1·0 + 0 + ... + 0 = 0

and the coefficient of v1 is not 0. On the other hand, any nonzero vector v is, by itself, independent; for

    kv = 0, v ≠ 0   implies   k = 0

Other examples of dependent and independent vectors follow.

Example 5.1: The vectors u = (1, -1, 0), v = (1, 3, -1) and w = (5, 3, -2) are dependent since 3u + 2v - w = 0:

    3(1, -1, 0) + 2(1, 3, -1) - (5, 3, -2) = (0, 0, 0)
Thus xu-'t yv + zw = implies x = 0, y = 0, z = Accordingly u, v and w are independent. Observe that the vectors in the preceding example form a matrix in echelon form: Thus we have shown that the (nonzero) rows of the above echelon matrix are independent. This result holds true in general; we state it formally as a theorem since it will be frequently used. Theorem 5.1: The nonzero rows of a matrix in echelon form are linearly independent. For more than one vector, the concept of dependence can be defined equivalently as follows: The vectors Vi, . . .,Vm are linearly dependent if and only if one of them is a linear combination of the others. For suppose, say, Vi is a linear combination of the others: Vi = aiVi + • • • + Ui-iVi-i + tti+iVi + i + • • • + UrnVm Then by adding —Vi to both sides, we obtain aiVl + • • • + Oi-iVi-i — Vi + Ui + lVi + i + • • • + amVm — where the coefficient of Vi is not 0; hence the vectors are linearly dependent. Conversely, suppose the vectors are linearly dependent, say, biVi + • • • + bjVj + • • • + bmVm = where bj ¥- Then Vj ~ —bi^biVi — ••• — bf^bj-iVj-i — bi^bj+iVj+i — ••• — bi^bmVm and so Vj is a linear combination of the other vectors. We now make a slightly stronger statement than that above; this result has many im- portant consequences. Lemma 5.2: The nonzero vectors Vi, . . .,Vm are linearly dependent if and only if one of them, say vt, is a linear combination of the preceding vectors: Vi — kiVi + kiVi + • • • + fci-iVi-i 88 BASIS AND DIMENSION [CHAP. 5 Remark 1. The set [vi, . . .,Vm} is called a dependent or independent set according as the vectors vi, . . .,Vm are dependent or independent. We also define the empty- set to be independent. Remark 2. If two of the vectors Vi, . . .,Vm are equal, say vi - vz, then the vectors are dependent, For , a., , . n., - n and the coefficient of i;i is not 0. Remark 3. Two vectors Vi and v^ are dependent if and only if one of them is a multiple of the other. Remark 4. 
A set which contains a dependent subset is itself dependent. Hence any subset of an independent set is independent. Remark 5. If the set {vu . . . , Vm} is independent, then any rearrangement of the vectors {Vij, Vi^, . . . , Vi„} is also independent. Remark 6. In the real space R^ dependence of vectors can be described geometrically as follows: any two vectors u and v are dependent if and only if they lie on the same line through the origin; and any three vectors u, v and w are dependent if and only if they lie on the same plane through the origin: u and V are dependent. u, V and w are dependent. BASIS AND DIMENSION We begin with a definition. Definition: A vector space V is said to be of finite dimension n or to be n-dimensional, written dim V = n, if there exists linearly independent vectors ei, e2, . . . , e„ which span V. The sequence {ei, 62, ... , e„) is then called a basis of V. The above definition of dimension is well defined in view of the following theorem. Theorem 5.3: Let F be a finite dimensional vector space. Then every basis of V has the same number of elements. The vector space {0} is defined to have dimension 0. (In a certain sense this agrees with the above definition since, by definition, is independent and generates {0}.) When a vector space is not of finite dimension, it is said to be of infinite dimension. Example 5.3: Let K be any field. Consider the vector space K" which consists of n-tuples of ele- ments of K. The vectors m n n n n\ ei = (1, 0, 0, . . ., 0, 0) 62 = (0,1,0, ...,0,0) e„ = (0,0,0 0,1) form a basis, called the usual basis, of K". Thus K" has dimension n. CHAP. 5] BASIS AND DIMENSION 89 Example 5.4: Let U be the vector space of all 2 X 3 matrices over a field K. Then the matrices /I 0\ /O 1 ON /O 1\ Vo 0/' i^o 0/' Vo 0/' ( 10 10 1 Example 5.5: form a basis of U. Thus dim C/ = 6. More generally, let V be the vector space of all m X % matrices over K and let E^ S y be the matrix with ly-entry 1 and elsewhere. 
Then the set {ffy} is a basis, called the usual basis, of V (Problem 5.32); consequently dim V — mn. Let W be the vector space of polynomials (in t) of degree — n. The set {1, t,t^, . . ., t"} is linearly independent and generates W. Thus it is a basis of W and so dim W = n+1. We comment that the vector space V of all polynomials is not finite dimensional since (Problem 4.26) no finite set of polynomials generates V. The above fundamental theorem on dimension is a consequence of the following im- portant "replacement lemma": Lemma 5,4: Suppose the set {vi, V2, . . ., Vn} generates a vector space V. If {wi, . . ., Wm} is linearly independent, then m — n and V is generated by a set of the form {Wi, ...,Wm, Vij, . . . , Vi^_J Thus, in particular, any % + 1 or more vectors in V are linearly dependent. Observe in the above lemma that we have replaced m of the vectors in the generating set by the m independent vectors and still retained a generating set. Now suppose S is a subset of a vector space V. We call {vi, . . ., Vm} a maximal in- dependent subset of S if: (i) it is an independent subset of S; and (ii) {vi, . . .,Vm,w} is dependent for any w e S. The following theorem applies. Theorem 5.5: Suppose S generates V and {vi, . . ., Vm} is a maximal independent subset of S. Then (vi, . . . , Vm} is a basis of V. The main relationship between the dimension of a vector space and its independent subsets is contained in the next theorem. Theorem 5.6: Let V be of finite dimension n. Then: (i) Any set of n + 1 or more vectors is linearly dependent. (ii) Any linearly independent set is part of a basis, i.e. can be extended to a basis, (iii) A linearly independent set with n elements is a basis. Example 5.6: The four vectors in K* (1,1,1,1), (0,1,1,1), (0,0,1,1), (0,0,0,1) are linearly independent since they form a matrix in echelon form. Furthermore, since dim K* = 4, they form a basis of K*. 
Example 5.7: The four vectors in R^3,

    (257, -132, 58), (43, 0, -17), (521, -317, 94), (328, -512, -731)

must be linearly dependent since they come from a vector space of dimension 3.

DIMENSION AND SUBSPACES

The following theorems give basic relationships between the dimension of a vector space and the dimension of a subspace.

Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In particular, if dim W = n, then W = V.

Example 5.8: Let W be a subspace of the real space R^3. Now dim R^3 = 3; hence by the preceding theorem the dimension of W can only be 0, 1, 2 or 3. The following cases apply:

    (i)   dim W = 0; then W = {0}, a point;
    (ii)  dim W = 1; then W is a line through the origin;
    (iii) dim W = 2; then W is a plane through the origin;
    (iv)  dim W = 3; then W is the entire space R^3.

Theorem 5.8: Let U and W be finite-dimensional subspaces of a vector space V. Then U + W has finite dimension and

    dim (U + W) = dim U + dim W - dim (U ∩ W)

Note that if V is the direct sum of U and W, i.e. V = U ⊕ W, then dim V = dim U + dim W (Problem 5.48).

Example 5.9: Suppose U and W are the xy plane and yz plane, respectively, in R^3: U = {(a, b, 0)}, W = {(0, b, c)}. Since R^3 = U + W, dim (U + W) = 3. Also, dim U = 2 and dim W = 2. By the above theorem,

    3 = 2 + 2 - dim (U ∩ W)   or   dim (U ∩ W) = 1

Observe that this agrees with the fact that U ∩ W is the y axis, i.e. U ∩ W = {(0, b, 0)}, and so has dimension 1.

(Diagram: the planes U and W in R^3, intersecting in the y axis.)

RANK OF A MATRIX

Let A be an arbitrary m × n matrix over a field K. Recall that the row space of A is the subspace of K^n generated by its rows, and the column space of A is the subspace of K^m generated by its columns. The dimensions of the row space and of the column space of A are called, respectively, the row rank and the column rank of A.

Theorem 5.9: The row rank and the column rank of the matrix A are equal.
Definition: The rank of the matrix A, written rank (A), is the common value of its row rank and column rank.

Thus the rank of a matrix gives the maximum number of independent rows, and also the maximum number of independent columns. We can obtain the rank of a matrix as follows. Suppose

    A = (1  2  0 -1)
        (2  6 -3 -3)
        (3 10 -6 -5)

We reduce A to echelon form using the elementary row operations:

    A  to  (1 2  0 -1)  to  (1 2  0 -1)
           (0 2 -3 -1)      (0 2 -3 -1)
           (0 4 -6 -2)      (0 0  0  0)
Observe that the above system is also equivalent to the vector equation
    x1 (a11, a21, ..., am1) + x2 (a12, a22, ..., am2) + ... + xn (a1n, a2n, ..., amn) = (b1, b2, ..., bm)
where the n-tuples are viewed as column vectors. Thus the system AX = B has a solution if and only if the column vector B is a linear combination of the columns of the matrix A, i.e. belongs to the column space of A. This gives us the following basic existence theorem.

Theorem 5.10: The system of linear equations AX = B has a solution if and only if the coefficient matrix A and the augmented matrix (A, B) have the same rank.

92 BASIS AND DIMENSION [CHAP. 5

Recall (Theorem 2.1) that if the system AX = B does have a solution, say v, then its general solution is of the form v + W = {v + w : w ∈ W} where W is the general solution of the associated homogeneous system AX = 0. Now W is a subspace of K^n and so has a dimension. The next theorem, whose proof is postponed until the next chapter (page 127), applies.

Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear equations AX = 0 is n - r, where n is the number of unknowns and r is the rank of the coefficient matrix A.

In case the system AX = 0 is in echelon form, it has precisely n - r free variables (see page 21), say x_i1, x_i2, ..., x_i(n-r). Let vj be the solution obtained by setting x_ij = 1 and all other free variables = 0. Then the solutions v1, ..., v(n-r) are linearly independent (Problem 5.43) and so form a basis for the solution space.

Example 5.10: Find the dimension and a basis of the solution space W of the system of linear equations
    x + 2y - 4z + 3r - s = 0
    x + 2y - 2z + 2r + s = 0
    2x + 4y - 2z + 3r + 4s = 0
Reduce the system to echelon form:
    x + 2y - 4z + 3r - s = 0
            2z - r + 2s = 0
            6z - 3r + 6s = 0
and then
    x + 2y - 4z + 3r - s = 0
            2z - r + 2s = 0
There are 5 unknowns and 2 (nonzero) equations in echelon form; hence dim W = 5 - 2 = 3. Note that the free variables are y, r and s.
Set: (i) y = 1, r = 0, s = 0; (ii) y = 0, r = 1, s = 0; (iii) y = 0, r = 0, s = 1 to obtain the following respective solutions:
    v1 = (-2, 1, 0, 0, 0),   v2 = (-1, 0, 1/2, 1, 0),   v3 = (-3, 0, -1, 0, 1)
The set {v1, v2, v3} is a basis of the solution space W.

COORDINATES

Let {e1, ..., en} be a basis of an n-dimensional vector space V over a field K, and let v be any vector in V. Since {ei} generates V, v is a linear combination of the ei:
    v = a1 e1 + a2 e2 + ... + an en,   ai ∈ K
Since the ei are independent, such a representation is unique (Problem 5.7), i.e. the n scalars a1, ..., an are completely determined by the vector v and the basis {ei}. We call these scalars the coordinates of v in {ei}, and we call the n-tuple (a1, ..., an) the coordinate vector of v relative to {ei} and denote it by [v]e or simply [v]:
    [v]e = (a1, a2, ..., an)

Example 5.11: Let V be the vector space of polynomials with degree ≤ 2:
    V = {at^2 + bt + c : a, b, c ∈ R}
The polynomials e1 = 1, e2 = t - 1 and e3 = (t - 1)^2 = t^2 - 2t + 1 form a basis for V. Let v = 2t^2 - 5t + 6. Find [v]e, the coordinate vector of v relative to the basis {e1, e2, e3}.

Set v as a linear combination of the ei using the unknowns x, y and z: v = x e1 + y e2 + z e3.
    2t^2 - 5t + 6 = x(1) + y(t - 1) + z(t^2 - 2t + 1)
                  = x + yt - y + zt^2 - 2zt + z
                  = zt^2 + (y - 2z)t + (x - y + z)
Then set the coefficients of the same powers of t equal to each other:
    x - y + z = 6
        y - 2z = -5
            z = 2
The solution of the above system is x = 3, y = -1, z = 2. Thus
    v = 3e1 - e2 + 2e3,   and so   [v]e = (3, -1, 2)

Example 5.12: Consider the real space R^3. Find the coordinate vector of v = (3, 1, -4) relative to the basis f1 = (1, 1, 1), f2 = (0, 1, 1), f3 = (0, 0, 1).
Set v as a linear combination of the fi using the unknowns x, y and z: v = x f1 + y f2 + z f3.
    (3, 1, -4) = x(1, 1, 1) + y(0, 1, 1) + z(0, 0, 1)
               = (x, x, x) + (0, y, y) + (0, 0, z)
               = (x, x + y, x + y + z)
Then set the corresponding components equal to each other to obtain the equivalent system of equations
    x = 3,   x + y = 1,   x + y + z = -4
having solution x = 3, y = -2, z = -5. Thus [v]f = (3, -2, -5).

We remark that relative to the usual basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), the coordinate vector of v is identical to v itself: [v]e = (3, 1, -4) = v.

We have shown above that to each vector v ∈ V there corresponds, relative to a given basis {e1, ..., en}, an n-tuple [v]e in K^n. On the other hand, if (a1, ..., an) ∈ K^n, then there exists a vector in V of the form a1 e1 + ... + an en. Thus the basis {ei} determines a one-to-one correspondence between the vectors in V and the n-tuples in K^n. Observe also that if
    v = a1 e1 + ... + an en   corresponds to   (a1, ..., an)
and
    w = b1 e1 + ... + bn en   corresponds to   (b1, ..., bn)
then
    v + w = (a1 + b1)e1 + ... + (an + bn)en   corresponds to   (a1, ..., an) + (b1, ..., bn)
and, for any scalar k ∈ K,
    kv = (k a1)e1 + ... + (k an)en   corresponds to   k(a1, ..., an)
That is,
    [v + w]e = [v]e + [w]e   and   [kv]e = k[v]e
Thus the above one-to-one correspondence between V and K^n preserves the vector space operations of vector addition and scalar multiplication; we then say that V and K^n are isomorphic, written V ≅ K^n. We state this result formally.

Theorem 5.12: Let V be an n-dimensional vector space over a field K. Then V and K^n are isomorphic.

The next example gives a practical application of the above result.
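Before turning to that example, note that the computation in Example 5.12 succeeds by simple forward substitution because the basis is triangular: each fk has entry 1 in position k and 0 in the earlier positions. A minimal Python sketch of that observation (an aside on this particular basis, not a general method from the text):

```python
# Example 5.12: coordinates of v = (3, 1, -4) in the triangular basis
# f1 = (1,1,1), f2 = (0,1,1), f3 = (0,0,1).
f = [(1, 1, 1), (0, 1, 1), (0, 0, 1)]
v = (3, 1, -4)

coords = []
residual = list(v)
for k, fk in enumerate(f):
    c = residual[k]          # fk has 1 in position k, 0 before it
    coords.append(c)
    residual = [r - c * a for r, a in zip(residual, fk)]

print(coords)  # [3, -2, -5]
```

A sanity check is to rebuild v from the coordinates: 3 f1 - 2 f2 - 5 f3 = (3, 1, -4), confirming [v]f = (3, -2, -5).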
Example 5.13: Determine whether the following matrices are dependent or independent:

    A = | 1  2  -3 |   B = | 1  3  -4 |   C = |  3   8  -11 |
        | 4  0   1 |       | 6  5   4 |       | 16  10    9 |

The coordinate vectors of the above matrices relative to the basis in Example 5.4, page 89, are
    [A] = (1, 2, -3, 4, 0, 1),   [B] = (1, 3, -4, 6, 5, 4),   [C] = (3, 8, -11, 16, 10, 9)
Form the matrix M whose rows are the above coordinate vectors, and row reduce M to echelon form:

    M = | 1  2   -3   4   0  1 |  to  | 1  2  -3  4   0  1 |  to  | 1  2  -3  4  0  1 |
        | 1  3   -4   6   5  4 |      | 0  1  -1  2   5  3 |      | 0  1  -1  2  5  3 |
        | 3  8  -11  16  10  9 |      | 0  2  -2  4  10  6 |      | 0  0   0  0  0  0 |

Since the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B] and [C] generate a space of dimension 2 and so are dependent. Accordingly, the original matrices A, B and C are dependent.

Solved Problems

LINEAR DEPENDENCE

5.1. Determine whether or not u and v are linearly dependent if:
    (i) u = (3, 4), v = (1, -3)
    (ii) u = (2, -3), v = (6, -9)
    (iii) u = (4, 3, -2), v = (2, -6, 7)
    (iv) u = (-4, 6, -2), v = (2, -3, 1)
    (v) u = | 1 -2  4 |,  v = | 2 -4  8 |
            | 3  0 -1 |       | 6  0 -2 |
    (vi) u = | 1  2 -3 |,  v = | 6 -5  4 |
             | 6 -5  4 |       | 1  2 -3 |
    (vii) u = 2 - 5t + 6t^2 - t^3,  v = 3 + 2t - 4t^2 + 5t^3
    (viii) u = 1 - 3t + 2t^2 - 3t^3,  v = -3 + 9t - 6t^2 + 9t^3

Two vectors u and v are dependent if and only if one is a multiple of the other.
(i) No. (ii) Yes; for v = 3u. (iii) No. (iv) Yes; for u = -2v. (v) Yes; for v = 2u. (vi) No. (vii) No. (viii) Yes; for v = -3u.

5.2. Determine whether or not the following vectors in R^3 are linearly dependent:
    (i) (1, -2, 1), (2, 1, -1), (7, -4, 1)
    (ii) (1, -3, 7), (2, 0, -6), (3, -1, -1), (2, 4, -5)
    (iii) (1, 2, -3), (1, -3, 2), (2, -1, 5)
    (iv) (2, -3, 7), (0, 0, 0), (3, -1, -4)

(i) Method 1. Set a linear combination of the vectors equal to the zero vector using unknown scalars x, y and z:
    x(1, -2, 1) + y(2, 1, -1) + z(7, -4, 1) = (0, 0, 0)
Then
    (x, -2x, x) + (2y, y, -y) + (7z, -4z, z) = (0, 0, 0)
or
    (x + 2y + 7z, -2x + y - 4z, x - y + z) = (0, 0, 0)
Set corresponding components equal to each other to obtain the equivalent homogeneous system, and reduce to echelon form:
    x + 2y + 7z = 0         x + 2y + 7z = 0         x + 2y + 7z = 0
    -2x + y - 4z = 0   or       5y + 10z = 0   or       y + 2z = 0
    x - y + z = 0              -3y - 6z = 0
The system, in echelon form, has only two nonzero equations in the three unknowns; hence the system has a nonzero solution. Thus the original vectors are linearly dependent.

Method 2. Form the matrix whose rows are the given vectors, and reduce to echelon form using the elementary row operations:
    | 1 -2  1 |      | 1 -2  1 |      | 1 -2  1 |
    | 2  1 -1 |  to  | 0  5 -3 |  to  | 0  5 -3 |
    | 7 -4  1 |      | 0 10 -6 |      | 0  0  0 |
Since the echelon matrix has a zero row, the vectors are dependent. (The three given vectors generate a space of dimension 2.)

(ii) Yes, since any four (or more) vectors in R^3 are dependent.

(iii) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:
    | 1  2 -3 |      | 1  2 -3 |      | 1  2 -3 |
    | 1 -3  2 |  to  | 0 -5  5 |  to  | 0 -5  5 |
    | 2 -1  5 |      | 0 -5 11 |      | 0  0  6 |
Since the echelon matrix has no zero rows, the vectors are independent. (The three given vectors generate a space of dimension 3.)

(iv) Since 0 = (0, 0, 0) is one of the vectors, the vectors are dependent.

5.3. Let V be the vector space of 2 x 2 matrices over R. Determine whether the matrices A, B, C ∈ V are dependent where:

    (i) A = | 1 1 |,  B = | 1 0 |,  C = | 1 1 |
            | 1 1 |       | 0 1 |       | 0 0 |

    (ii) A = | 1 2 |,  B = | 3 -1 |,  C = |  1 -5 |
             | 3 1 |       | 2  2 |       | -4  0 |

(i) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown scalars x, y and z; that is, set xA + yB + zC = 0. Thus:
    x | 1 1 | + y | 1 0 | + z | 1 1 | = | 0 0 |
      | 1 1 |     | 0 1 |     | 0 0 |   | 0 0 |
or
    | x + y + z   x + z | = | 0 0 |
    | x           x + y |   | 0 0 |
Set corresponding entries equal to each other to obtain the equivalent homogeneous system of equations:
    x + y + z = 0,   x + z = 0,   x = 0,   x + y = 0
Solving the above system we obtain only the zero solution, x = 0, y = 0, z = 0. We have shown that xA + yB + zC = 0 implies x = 0, y = 0, z = 0; hence the matrices A, B and C are linearly independent.
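Part (i) can be double-checked by flattening each 2 × 2 matrix into a 4-vector: the matrices are independent exactly when their coordinate vectors are, and that is a rank computation. A Python sketch (the rank routine is ordinary Gaussian elimination written for this note, not code from the text):

```python
from fractions import Fraction

def rank(rows):
    # Row rank via Gaussian elimination over the rationals (exact arithmetic).
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# A, B, C of part (i), flattened row-major into vectors of R^4
flat = [(1, 1, 1, 1),    # A = [[1, 1], [1, 1]]
        (1, 0, 0, 1),    # B = [[1, 0], [0, 1]]
        (1, 1, 0, 0)]    # C = [[1, 1], [0, 0]]
print(rank(flat))  # 3 -> the three matrices are independent
```

Three vectors of rank 3 span a 3-dimensional space, so no nontrivial relation xA + yB + zC = 0 exists, in agreement with the hand computation.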
(ii) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown scalars x, y and z; that is, set xA + yB + zC = 0. Thus:
    x | 1 2 | + y | 3 -1 | + z |  1 -5 | = | 0 0 |
      | 3 1 |     | 2  2 |     | -4  0 |   | 0 0 |
or
    | x + 3y + z     2x - y - 5z | = | 0 0 |
    | 3x + 2y - 4z   x + 2y      |   | 0 0 |
Set corresponding entries equal to each other to obtain the equivalent homogeneous system of linear equations and reduce to echelon form:
    x + 3y + z = 0         x + 3y + z = 0
    2x - y - 5z = 0   or      -7y - 7z = 0   or finally   x + 3y + z = 0
    3x + 2y - 4z = 0          -7y - 7z = 0                    y + z = 0
    x + 2y = 0                 -y - z = 0
The system in echelon form has a free variable and hence a nonzero solution, for example, x = 2, y = -1, z = 1. We have shown that xA + yB + zC = 0 does not imply that x = 0, y = 0, z = 0; hence the matrices are linearly dependent.

5.4. Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether u, v, w ∈ V are independent or dependent where:
    (i) u = t^3 - 3t^2 + 5t + 1,  v = t^3 - t^2 + 8t + 2,  w = 2t^3 - 4t^2 + 9t + 5
    (ii) u = t^3 + 4t^2 - 2t + 3,  v = t^3 + 6t^2 - t + 4,  w = 3t^3 + 8t^2 - 8t + 7

(i) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:
    x(t^3 - 3t^2 + 5t + 1) + y(t^3 - t^2 + 8t + 2) + z(2t^3 - 4t^2 + 9t + 5) = 0
or
    xt^3 - 3xt^2 + 5xt + x + yt^3 - yt^2 + 8yt + 2y + 2zt^3 - 4zt^2 + 9zt + 5z = 0
or
    (x + y + 2z)t^3 + (-3x - y - 4z)t^2 + (5x + 8y + 9z)t + (x + 2y + 5z) = 0
The coefficients of the powers of t must each be 0:
    x + y + 2z = 0
    -3x - y - 4z = 0
    5x + 8y + 9z = 0
    x + 2y + 5z = 0
Solving the above homogeneous system, we obtain only the zero solution: x = 0, y = 0, z = 0; hence u, v and w are independent.

(ii) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using unknown scalars x, y and z; that is, set xu + yv + zw = 0.
Thus:
    x(t^3 + 4t^2 - 2t + 3) + y(t^3 + 6t^2 - t + 4) + z(3t^3 + 8t^2 - 8t + 7) = 0
or
    xt^3 + 4xt^2 - 2xt + 3x + yt^3 + 6yt^2 - yt + 4y + 3zt^3 + 8zt^2 - 8zt + 7z = 0
or
    (x + y + 3z)t^3 + (4x + 6y + 8z)t^2 + (-2x - y - 8z)t + (3x + 4y + 7z) = 0
Set the coefficients of the powers of t each equal to 0 and reduce the system to echelon form:
    x + y + 3z = 0          x + y + 3z = 0
    4x + 6y + 8z = 0   or      2y - 4z = 0   or finally   x + y + 3z = 0
    -2x - y - 8z = 0            y - 2z = 0                    y - 2z = 0
    3x + 4y + 7z = 0            y - 2z = 0
The system in echelon form has a free variable and hence a nonzero solution. We have shown that xu + yv + zw = 0 does not imply that x = 0, y = 0, z = 0; hence the polynomials are linearly dependent.

5.5. Let V be the vector space of functions from R into R. Show that f, g, h ∈ V are independent where: (i) f(t) = e^(2t), g(t) = t^2, h(t) = t; (ii) f(t) = sin t, g(t) = cos t, h(t) = t.

In each case set a linear combination of the functions equal to the zero function using unknown scalars x, y and z: xf + yg + zh = 0; and then show that x = 0, y = 0, z = 0. We emphasize that xf + yg + zh = 0 means that, for every value of t, xf(t) + yg(t) + zh(t) = 0.

(i) In the equation x e^(2t) + y t^2 + z t = 0, substitute
    t = 0  to obtain  x e^0 + y·0 + z·0 = 0  or  x = 0
    t = 1  to obtain  x e^2 + y + z = 0
    t = 2  to obtain  x e^4 + 4y + 2z = 0
Solve the system
    x = 0,   x e^2 + y + z = 0,   x e^4 + 4y + 2z = 0
to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.

(ii) Method 1. In the equation x sin t + y cos t + zt = 0, substitute
    t = 0    to obtain  x·0 + y·1 + z·0 = 0   or  y = 0
    t = π/2  to obtain  x·1 + y·0 + zπ/2 = 0  or  x + πz/2 = 0
    t = π    to obtain  x·0 + y·(-1) + z·π = 0  or  -y + πz = 0
Solve the system
    y = 0,   x + πz/2 = 0,   -y + πz = 0
to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.

Method 2. Take the first, second and third derivatives of x sin t + y cos t + zt = 0 with respect to t to get
    x cos t - y sin t + z = 0    (1)
    -x sin t - y cos t = 0       (2)
    -x cos t + y sin t = 0       (3)
Add (1) and (3) to obtain z = 0. Multiply (2) by sin t and (3) by cos t, and then add:
    sin t × (2):  -x sin^2 t - y sin t cos t = 0
    cos t × (3):  -x cos^2 t + y sin t cos t = 0
                  ---------------------------------
                  -x(sin^2 t + cos^2 t) = 0   or   x = 0
Lastly, multiply (2) by -cos t and (3) by sin t; and then add to obtain y(cos^2 t + sin^2 t) = 0 or y = 0. Since x sin t + y cos t + zt = 0 implies x = 0, y = 0, z = 0, the functions f, g and h are independent.

5.6. Let u, v and w be independent vectors. Show that u + v, u - v and u - 2v + w are also independent.
Suppose x(u + v) + y(u - v) + z(u - 2v + w) = 0 where x, y and z are scalars. Then
    xu + xv + yu - yv + zu - 2zv + zw = 0
or
    (x + y + z)u + (x - y - 2z)v + zw = 0
But u, v and w are linearly independent; hence the coefficients in the above relation are each 0:
    x + y + z = 0,   x - y - 2z = 0,   z = 0
The only solution to the above system is x = 0, y = 0, z = 0. Hence u + v, u - v and u - 2v + w are independent.

5.7. Let v1, v2, ..., vm be independent vectors, and suppose u is a linear combination of the vi, say u = a1v1 + a2v2 + ... + am vm where the ai are scalars. Show that the above representation of u is unique.
Suppose u = b1v1 + b2v2 + ... + bm vm where the bi are scalars. Subtracting,
    0 = u - u = (a1 - b1)v1 + (a2 - b2)v2 + ... + (am - bm)vm
But the vi are linearly independent; hence the coefficients in the above relation are each 0:
    a1 - b1 = 0,   a2 - b2 = 0,   ...,   am - bm = 0
Hence a1 = b1, a2 = b2, ..., am = bm, and so the above representation of u as a linear combination of the vi is unique.

5.8. Show that the vectors v = (1 + i, 2i) and w = (1, 1 + i) in C^2 are linearly dependent over the complex field C but are linearly independent over the real field R.
Recall that two vectors are dependent iff one is a multiple of the other. Since the first coordinate of w is 1, v can be a multiple of w only if v = (1 + i)w. But 1 + i ∉ R; hence v and w are independent over R. Since (1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = v and 1 + i ∈ C, they are dependent over C.

5.9.
Suppose S = {v1, ..., vm} contains a dependent subset, say {v1, ..., vr}. Show that S is also dependent. Hence every subset of an independent set is independent.
Since {v1, ..., vr} is dependent, there exist scalars a1, ..., ar, not all 0, such that a1v1 + a2v2 + ... + ar vr = 0. Hence there exist scalars a1, ..., ar, 0, ..., 0, not all 0, such that
    a1v1 + ... + ar vr + 0v_(r+1) + ... + 0vm = 0
Accordingly, S is dependent.

5.10. Suppose {v1, ..., vm} is independent, but {v1, ..., vm, w} is dependent. Show that w is a linear combination of the vi.
Method 1. Since {v1, ..., vm, w} is dependent, there exist scalars a1, ..., am, b, not all 0, such that a1v1 + ... + am vm + bw = 0. If b = 0, then one of the ai is not zero and a1v1 + ... + am vm = 0. But this contradicts the hypothesis that {v1, ..., vm} is independent. Accordingly, b ≠ 0 and so
    w = -b^(-1)(a1v1 + ... + am vm) = -b^(-1)a1 v1 - ... - b^(-1)am vm
That is, w is a linear combination of the vi.
Method 2. If w = 0, then w = 0v1 + ... + 0vm. On the other hand, if w ≠ 0 then, by Lemma 5.2, one of the vectors in {v1, ..., vm, w} is a linear combination of the preceding vectors. This vector cannot be one of the v's since {v1, ..., vm} is independent. Hence w is a linear combination of the vi.

PROOFS OF THEOREMS

5.11. Prove Lemma 5.2: The nonzero vectors v1, ..., vm are linearly dependent if and only if one of them, say vi, is a linear combination of the preceding vectors: vi = a1v1 + ... + a_(i-1)v_(i-1).
Suppose vi = a1v1 + ... + a_(i-1)v_(i-1). Then
    a1v1 + ... + a_(i-1)v_(i-1) - vi + 0v_(i+1) + ... + 0vm = 0
and the coefficient of vi is not 0. Hence the vi are linearly dependent.
Conversely, suppose the vi are linearly dependent. Then there exist scalars a1, ..., am, not all 0, such that a1v1 + ... + am vm = 0. Let k be the largest integer such that ak ≠ 0. Then
    a1v1 + ... + ak vk + 0v_(k+1) + ... + 0vm = 0   or   a1v1 + ... + ak vk = 0
Suppose k = 1; then a1v1 = 0, a1 ≠ 0 and so v1 = 0.
But the vi are nonzero vectors; hence k > 1 and
    vk = -ak^(-1)a1 v1 - ... - ak^(-1)a_(k-1) v_(k-1)
That is, vk is a linear combination of the preceding vectors.

5.12. Prove Theorem 5.1: The nonzero rows R1, ..., Rn of a matrix in echelon form are linearly independent.
Suppose {Rn, R_(n-1), ..., R1} is dependent. Then one of the rows, say Rm, is a linear combination of the preceding rows:
    Rm = a_(m+1)R_(m+1) + a_(m+2)R_(m+2) + ... + an Rn     (*)
Now suppose the kth component of Rm is its first nonzero entry. Then, since the matrix is in echelon form, the kth components of R_(m+1), ..., Rn are all 0, and so the kth component of (*) is
    a_(m+1)·0 + a_(m+2)·0 + ... + an·0 = 0
But this contradicts the assumption that the kth component of Rm is not 0. Thus R1, ..., Rn are independent.

5.13. Suppose {v1, ..., vm} generates a vector space V. Prove:
(i) If w ∈ V, then {w, v1, ..., vm} is linearly dependent and generates V.
(ii) If vi is a linear combination of the preceding vectors, then {v1, ..., v_(i-1), v_(i+1), ..., vm} generates V.
(i) If w ∈ V, then w is a linear combination of the vi since {vi} generates V. Accordingly, {w, v1, ..., vm} is linearly dependent. Clearly, w with the vi generate V since the vi by themselves generate V. That is, {w, v1, ..., vm} generates V.
(ii) Suppose vi = k1v1 + ... + k_(i-1)v_(i-1). Let u ∈ V. Since {vi} generates V, u is a linear combination of the vi, say u = a1v1 + ... + am vm. Substituting for vi, we obtain
    u = a1v1 + ... + a_(i-1)v_(i-1) + ai(k1v1 + ... + k_(i-1)v_(i-1)) + a_(i+1)v_(i+1) + ... + am vm
      = (a1 + ai k1)v1 + ... + (a_(i-1) + ai k_(i-1))v_(i-1) + a_(i+1)v_(i+1) + ... + am vm
Thus {v1, ..., v_(i-1), v_(i+1), ..., vm} generates V. In other words, we can delete vi from the generating set and still retain a generating set.

5.14. Prove Lemma 5.4: Suppose {v1, ..., vn} generates a vector space V. If {w1, ..., wm} is linearly independent, then m ≤ n and V is generated by a set of the form {w1, ..., wm, v_(i1), ..., v_(i_(n-m))}.
Thus, in particular, any n + 1 or more vectors in V are linearly dependent.
It suffices to prove the theorem in the case that the vi are all not 0. (Prove!) Since {vi} generates V, we have by the preceding problem that
    {w1, v1, ..., vn}     (1)
is linearly dependent and also generates V. By Lemma 5.2, one of the vectors in (1) is a linear combination of the preceding vectors. This vector cannot be w1, so it must be one of the v's, say vj. Thus by the preceding problem we can delete vj from the generating set (1) and obtain the generating set
    {w1, v1, ..., v_(j-1), v_(j+1), ..., vn}     (2)
Now we repeat the argument with the vector w2. That is, since (2) generates V, the set
    {w1, w2, v1, ..., v_(j-1), v_(j+1), ..., vn}     (3)
is linearly dependent and also generates V. Again by Lemma 5.2, one of the vectors in (3) is a linear combination of the preceding vectors. We emphasize that this vector cannot be w1 or w2 since {w1, ..., wm} is independent; hence it must be one of the v's, say vk. Thus by the preceding problem we can delete vk from the generating set (3) and obtain the generating set
    {w1, w2, v1, ..., v_(j-1), v_(j+1), ..., v_(k-1), v_(k+1), ..., vn}
We repeat the argument with w3 and so forth. At each step we are able to add one of the w's and delete one of the v's in the generating set. If m ≤ n, then we finally obtain a generating set of the required form:
    {w1, ..., wm, v_(i1), ..., v_(i_(n-m))}
Lastly, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain the generating set {w1, ..., wn}. This implies that w_(n+1) is a linear combination of w1, ..., wn, which contradicts the hypothesis that {wi} is linearly independent.

5.15. Prove Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of V has the same number of vectors.
Suppose {e1, e2, ..., en} is a basis of V, and suppose {f1, f2, ...} is another basis of V. Since {ei} generates V, the basis {f1, f2, ...} must contain n or fewer vectors, or else it is dependent by the preceding problem.
On the other hand, if the basis {f1, f2, ...} contains fewer than n vectors, then {e1, ..., en} is dependent by the preceding problem. Thus the basis {f1, f2, ...} contains exactly n vectors, and so the theorem is true.

5.16. Prove Theorem 5.5: Suppose {v1, ..., vm} is a maximal independent subset of a set S which generates a vector space V. Then {v1, ..., vm} is a basis of V.
Suppose w ∈ S. Then, since {vi} is a maximal independent subset of S, {v1, ..., vm, w} is linearly dependent. By Problem 5.10, w is a linear combination of the vi, that is, w ∈ L(vi). Hence S ⊆ L(vi). This leads to V = L(S) ⊆ L(vi) ⊆ V. Accordingly, {vi} generates V and, since it is independent, it is a basis of V.

5.17. Suppose V is generated by a finite set S. Show that V is of finite dimension and, in particular, a subset of S is a basis of V.
Method 1. Of all the independent subsets of S, and there is a finite number of them since S is finite, one of them is maximal. By the preceding problem this subset of S is a basis of V.
Method 2. If S is independent, it is a basis of V. If S is dependent, one of the vectors is a linear combination of the preceding vectors. We may delete this vector and still retain a generating set. We continue this process until we obtain a subset which is independent and generates V, i.e. is a basis of V.

5.18. Prove Theorem 5.6: Let V be of finite dimension n. Then: (i) Any set of n + 1 or more vectors is linearly dependent. (ii) Any linearly independent set is part of a basis. (iii) A linearly independent set with n elements is a basis.
Suppose {e1, ..., en} is a basis of V.
(i) Since {e1, ..., en} generates V, any n + 1 or more vectors are dependent by Lemma 5.4.
(ii) Suppose {v1, ..., vr} is independent. By Lemma 5.4, V is generated by a set of the form
    S = {v1, ..., vr, e_(i1), ..., e_(i_(n-r))}
By the preceding problem, a subset of S is a basis. But S contains n elements and every basis of V contains n elements.
Thus S is a basis of V and contains {v1, ..., vr} as a subset.
(iii) By (ii), an independent set T with n elements is part of a basis. But every basis of V contains n elements. Thus, T is a basis.

5.19. Prove Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In particular, if dim W = n, then W = V.
Since V is of dimension n, any n + 1 or more vectors are linearly dependent. Furthermore, since a basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly, dim W ≤ n.
In particular, if {w1, ..., wn} is a basis of W, then since it is an independent set with n elements it is also a basis of V. Thus W = V when dim W = n.

5.20. Prove Theorem 5.8: dim (U + W) = dim U + dim W - dim (U ∩ W).
Observe that U ∩ W is a subspace of both U and W. Suppose dim U = m, dim W = n and dim (U ∩ W) = r. Suppose {v1, ..., vr} is a basis of U ∩ W. By Theorem 5.6(ii), we can extend {vi} to a basis of U and to a basis of W; say,
    {v1, ..., vr, u1, ..., u_(m-r)}   and   {v1, ..., vr, w1, ..., w_(n-r)}
are bases of U and W respectively. Let
    B = {v1, ..., vr, u1, ..., u_(m-r), w1, ..., w_(n-r)}
Note that B has exactly m + n - r elements. Thus the theorem is proved if we can show that B is a basis of U + W. Since {vi, uj} generates U and {vi, wk} generates W, the union B = {vi, uj, wk} generates U + W. Thus it suffices to show that B is independent.
Suppose
    a1v1 + ... + ar vr + b1u1 + ... + b_(m-r)u_(m-r) + c1w1 + ... + c_(n-r)w_(n-r) = 0     (1)
where ai, bj, ck are scalars. Let
    v = a1v1 + ... + ar vr + b1u1 + ... + b_(m-r)u_(m-r)     (2)
By (1), we also have that
    v = -c1w1 - ... - c_(n-r)w_(n-r)     (3)
Since {vi, uj} ⊆ U, v ∈ U by (2); and since {wk} ⊆ W, v ∈ W by (3). Accordingly, v ∈ U ∩ W. Now {vi} is a basis of U ∩ W and so there exist scalars d1, ..., dr for which v = d1v1 + ... + dr vr. Thus by (3) we have
    d1v1 + ... + dr vr + c1w1 + ... + c_(n-r)w_(n-r) = 0
But {vi, wk} is a basis of W and so is independent. Hence the above equation forces c1 = 0, ..., c_(n-r) = 0. Substituting this into (1), we obtain
    a1v1 + ... + ar vr + b1u1 + ... + b_(m-r)u_(m-r) = 0
But {vi, uj} is a basis of U and so is independent. Hence the above equation forces a1 = 0, ..., ar = 0, b1 = 0, ..., b_(m-r) = 0.
Since the equation (1) implies that the ai, bj and ck are all 0, B = {vi, uj, wk} is independent and the theorem is proved.

5.21. Prove Theorem 5.9: The row rank and the column rank of any matrix are equal.
Let A be an arbitrary m × n matrix:

    A = | a11  a12  ...  a1n |
        | a21  a22  ...  a2n |
        | ...                |
        | am1  am2  ...  amn |

Let R1, R2, ..., Rm denote its rows:
    R1 = (a11, a12, ..., a1n),   ...,   Rm = (am1, am2, ..., amn)
Suppose the row rank is r and that the following r vectors form a basis for the row space:
    S1 = (b11, b12, ..., b1n),   S2 = (b21, b22, ..., b2n),   ...,   Sr = (br1, br2, ..., brn)
Then each of the row vectors is a linear combination of the Si:
    R1 = k11 S1 + k12 S2 + ... + k1r Sr
    R2 = k21 S1 + k22 S2 + ... + k2r Sr
    ...................................
    Rm = km1 S1 + km2 S2 + ... + kmr Sr
where the kij are scalars. Setting the jth components of each of the above vector equations equal to each other, we obtain the following system of equations, each valid for j = 1, ..., n:
    a1j = k11 b1j + k12 b2j + ... + k1r brj
    a2j = k21 b1j + k22 b2j + ... + k2r brj
    ...................................
    amj = km1 b1j + km2 b2j + ... + kmr brj
Thus for j = 1, ..., n:
    (a1j, a2j, ..., amj) = b1j (k11, k21, ..., km1) + b2j (k12, k22, ..., km2) + ... + brj (k1r, k2r, ..., kmr)
In other words, each of the columns of A is a linear combination of the r vectors
    (k11, k21, ..., km1), (k12, k22, ..., km2), ..., (k1r, k2r, ..., kmr)
Thus the column space of the matrix A has dimension at most r, i.e. column rank ≤ r. Hence, column rank ≤ row rank. Similarly (or by considering the transpose matrix A^t) we obtain row rank ≤ column rank. Thus the row rank and the column rank are equal.

BASIS AND DIMENSION

5.22.
Determine whether or not the following form a basis for the vector space R^3:
    (i) (1, 1, 1) and (1, -1, 5)
    (ii) (1, 2, 3), (1, 0, -1), (3, -1, 0) and (2, 1, -2)
    (iii) (1, 1, 1), (1, 2, 3) and (2, -1, 1)
    (iv) (1, 1, 2), (1, 2, 5) and (5, 3, 4)

(i) and (ii). No; for a basis of R^3 must contain exactly 3 elements, since R^3 is of dimension 3.

(iii) The vectors form a basis if and only if they are independent. Thus form the matrix whose rows are the given vectors, and row reduce to echelon form:
    | 1  1  1 |      | 1  1  1 |      | 1  1  1 |
    | 1  2  3 |  to  | 0  1  2 |  to  | 0  1  2 |
    | 2 -1  1 |      | 0 -3 -1 |      | 0  0  5 |
The echelon matrix has no zero rows; hence the three vectors are independent and so form a basis for R^3.

(iv) Form the matrix whose rows are the given vectors, and row reduce to echelon form:
    | 1  1  2 |      | 1  1  2 |      | 1  1  2 |
    | 1  2  5 |  to  | 0  1  3 |  to  | 0  1  3 |
    | 5  3  4 |      | 0 -2 -6 |      | 0  0  0 |
The echelon matrix has a zero row, i.e. only two nonzero rows; hence the three vectors are dependent and so do not form a basis for R^3.

5.23. Let W be the subspace of R^4 generated by the vectors (1, -2, 5, -3), (2, 3, 1, -4) and (3, 8, -3, -5). (i) Find a basis and the dimension of W. (ii) Extend the basis of W to a basis of the whole space R^4.
(i) Form the matrix whose rows are the given vectors, and row reduce to echelon form:
    | 1 -2   5  -3 |      | 1 -2   5  -3 |      | 1 -2  5 -3 |
    | 2  3   1  -4 |  to  | 0  7  -9   2 |  to  | 0  7 -9  2 |
    | 3  8  -3  -5 |      | 0 14 -18   4 |      | 0  0  0  0 |
The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row space, that is, of W. Thus, in particular, dim W = 2.
(ii) We seek four independent vectors which include the above two vectors. The vectors (1, -2, 5, -3), (0, 7, -9, 2), (0, 0, 1, 0) and (0, 0, 0, 1) are independent (since they form an echelon matrix), and so they form a basis of R^4 which is an extension of the basis of W.

5.24. Let W be the space generated by the polynomials
    v1 = t^3 - 2t^2 + 4t + 1        v3 = t^3 + 6t - 5
    v2 = 2t^3 - 3t^2 + 9t - 1       v4 = 2t^3 - 5t^2 + 7t + 5
Find a basis and the dimension of W.
The coordinate vectors of the given polynomials relative to the basis {t^3, t^2, t, 1} are respectively
    [v1] = (1, -2, 4, 1)     [v3] = (1, 0, 6, -5)
    [v2] = (2, -3, 9, -1)    [v4] = (2, -5, 7, 5)
Form the matrix whose rows are the above coordinate vectors, and row reduce to echelon form:
    | 1 -2  4   1 |      | 1 -2  4   1 |      | 1 -2  4   1 |
    | 2 -3  9  -1 |  to  | 0  1  1  -3 |  to  | 0  1  1  -3 |
    | 1  0  6  -5 |      | 0  2  2  -6 |      | 0  0  0   0 |
    | 2 -5  7   5 |      | 0 -1 -1   3 |      | 0  0  0   0 |
The nonzero rows (1, -2, 4, 1) and (0, 1, 1, -3) of the echelon matrix form a basis of the space generated by the coordinate vectors, and so the corresponding polynomials
    t^3 - 2t^2 + 4t + 1   and   t^2 + t - 3
form a basis of W. Thus dim W = 2.

5.25. Find the dimension and a basis of the solution space W of the system
    x + 2y + 2z - s + 3t = 0
    x + 2y + 3z + s + t = 0
    3x + 6y + 8z + s + 5t = 0
Reduce the system to echelon form:
    x + 2y + 2z - s + 3t = 0         x + 2y + 2z - s + 3t = 0
            z + 2s - 2t = 0    or            z + 2s - 2t = 0
            2z + 4s - 4t = 0
The system in echelon form has 2 (nonzero) equations in 5 unknowns; hence the dimension of the solution space W is 5 - 2 = 3. The free variables are y, s and t. Set
    (i) y = 1, s = 0, t = 0;   (ii) y = 0, s = 1, t = 0;   (iii) y = 0, s = 0, t = 1
to obtain the respective solutions
    v1 = (-2, 1, 0, 0, 0),   v2 = (5, 0, -2, 1, 0),   v3 = (-7, 0, 2, 0, 1)
The set {v1, v2, v3} is a basis of the solution space W.

5.26. Find a homogeneous system whose solution set W is generated by
    {(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)}
Method 1. Let v = (x, y, z, w). Form the matrix M whose first rows are the given vectors and whose last row is v; and then row reduce to echelon form:

    M = | 1 -2  0  3 |      | 1   -2      0   3     |      | 1 -2   0         3        |
        | 1 -1 -1  4 |  to  | 0    1     -1   1     |  to  | 0  1  -1         1        |
        | 1  0 -2  5 |      | 0    2     -2   2     |      | 0  0   2x+y+z   -5x-y+w   |
        | x  y  z  w |      | 0  2x+y     z   w-3x  |      | 0  0   0         0        |

The original first three rows show that W has dimension 2. Thus v ∈ W if and only if the additional row does not increase the dimension of the row space. Hence we set the last two entries in the third row on the right equal to 0 to obtain the required homogeneous system
    2x + y + z = 0
    5x + y - w = 0
Method 2.
We know that v = (x, y, z, w) ∈ W if and only if v is a linear combination of the generators of W:
    (x, y, z, w) = r(1, -2, 0, 3) + s(1, -1, -1, 4) + t(1, 0, -2, 5)     (1)
The above vector equation in unknowns r, s and t is equivalent to the following system:
    r + s + t = x           r + s + t = x           r + s + t = x
    -2r - s = y        or      s + 2t = 2x + y  or     s + 2t = 2x + y
    -s - 2t = z                -s - 2t = z                   0 = 2x + y + z
    3r + 4s + 5t = w           s + 2t = w - 3x              0 = 5x + y - w
Thus v ∈ W if and only if the above system has a solution, i.e. if
    2x + y + z = 0
    5x + y - w = 0
The above is the required homogeneous system.
Remark: Observe that the augmented matrix of the system (1) is the transpose of the matrix M used in the first method.

5.27. Let U and W be the following subspaces of R^4:
    U = {(a, b, c, d) : b + c + d = 0},   W = {(a, b, c, d) : a + b = 0, c = 2d}
Find the dimension and a basis of (i) U, (ii) W, (iii) U ∩ W.
(i) We seek a basis of the set of solutions (a, b, c, d) of the equation
    b + c + d = 0   or   0·a + b + c + d = 0
The free variables are a, c and d. Set
    (1) a = 1, c = 0, d = 0;   (2) a = 0, c = 1, d = 0;   (3) a = 0, c = 0, d = 1
to obtain the respective solutions
    v1 = (1, 0, 0, 0),   v2 = (0, -1, 1, 0),   v3 = (0, -1, 0, 1)
The set {v1, v2, v3} is a basis of U, and dim U = 3.
(ii) We seek a basis of the set of solutions (a, b, c, d) of the system
    a + b = 0         a + b = 0
    c = 2d      or    c - 2d = 0
The free variables are b and d. Set
    (1) b = 1, d = 0;   (2) b = 0, d = 1
to obtain the respective solutions
    v1 = (-1, 1, 0, 0),   v2 = (0, 0, 2, 1)
The set {v1, v2} is a basis of W, and dim W = 2.
(iii) U ∩ W consists of those vectors (a, b, c, d) which satisfy the conditions defining U and the conditions defining W, i.e. the three equations
    b + c + d = 0
    a + b = 0
    c - 2d = 0
The free variable is d. Set d = 1 to obtain the solution v = (3, -3, 2, 1). Thus {v} is a basis of U ∩ W, and dim (U ∩ W) = 1.

5.28.
Find the dimension of the vector space spanned by:
    (i) (1, -2, 3, -1) and (1, 1, -2, 3)
    (ii) (3, -6, 3, -9) and (-2, 4, -2, 6)
    (iii) t^3 + 2t^2 + 3t + 1 and 2t^3 + 4t^2 + 6t + 2
    (iv) t^3 - 2t^2 + 5 and t^2 + 3t - 4
    (v) [two 2 x 2 matrices, illegible in the source]
    (vi) |  1  1 | and | -3 -3 |
         | -1 -1 |     |  3  3 |
    (vii) 3 and -3

Two nonzero vectors span a space W of dimension 2 if they are independent, and of dimension 1 if they are dependent. Recall that two vectors are dependent if and only if one is a multiple of the other. Hence:
    (i) 2, (ii) 1, (iii) 1, (iv) 2, (v) 2, (vi) 1, (vii) 1.

5.29. Let V be the vector space of 2 by 2 symmetric matrices over R. Show that dim V = 3. (Recall that A = (aij) is symmetric iff A = A^t or, equivalently, aij = aji.)
An arbitrary 2 by 2 symmetric matrix is of the form
    A = | a  b |
        | b  c |
where a, b, c ∈ R. (Note that there are three "variables".) Setting
    (i) a = 1, b = 0, c = 0;   (ii) a = 0, b = 1, c = 0;   (iii) a = 0, b = 0, c = 1
we obtain the respective matrices
    E1 = | 1 0 |,   E2 = | 0 1 |,   E3 = | 0 0 |
         | 0 0 |         | 1 0 |         | 0 1 |
We show that {E1, E2, E3} is a basis of V, that is, that it (1) generates V and (2) is independent.
(1) For the above arbitrary matrix A in V, we have
    A = | a b | = aE1 + bE2 + cE3
        | b c |
Thus {E1, E2, E3} generates V.
(2) Suppose xE1 + yE2 + zE3 = 0, where x, y, z are unknown scalars. That is, suppose
    x | 1 0 | + y | 0 1 | + z | 0 0 | = | 0 0 |
      | 0 0 |     | 1 0 |     | 0 1 |   | 0 0 |
Setting corresponding entries equal to each other, we obtain x = 0, y = 0, z = 0. In other words,
    xE1 + yE2 + zE3 = 0   implies   x = 0, y = 0, z = 0
Accordingly, {E1, E2, E3} is independent.
Thus {E1, E2, E3} is a basis of V and so the dimension of V is 3.

5.30. Let V be the space of polynomials in t of degree ≤ n. Show that each of the following is a basis of V:
    (i) {1, t, t^2, ..., t^(n-1), t^n}
    (ii) {1, 1 - t, (1 - t)^2, ..., (1 - t)^(n-1), (1 - t)^n}
Thus dim V = n + 1.
(i) Clearly each polynomial in V is a linear combination of 1, t, ..., t^(n-1) and t^n. Furthermore, 1, t,
., t^(n-1) and t^n are independent since none is a linear combination of the preceding polynomials. Thus {1, t, ..., t^n} is a basis of V.
(ii) (Note that by (i), dim V = n + 1; and so any n + 1 independent polynomials form a basis of V.) Now each polynomial in the sequence 1, 1 - t, ..., (1 - t)^n is of degree higher than the preceding ones and so is not a linear combination of the preceding ones. Thus the n + 1 polynomials 1, 1 - t, ..., (1 - t)^n are independent and so form a basis of V.

5.31. Let V be the vector space of ordered pairs of complex numbers over the real field R (see Problem 4.42). Show that V is of dimension 4.
We claim that the following is a basis of V:
    B = {(1, 0), (i, 0), (0, 1), (0, i)}
Suppose v ∈ V. Then v = (z, w) where z, w are complex numbers, and so v = (a + bi, c + di) where a, b, c, d are real numbers. Then
    v = a(1, 0) + b(i, 0) + c(0, 1) + d(0, i)
Thus B generates V.
The proof is complete if we show that B is independent. Suppose
    x1(1, 0) + x2(i, 0) + x3(0, 1) + x4(0, i) = 0
where x1, x2, x3, x4 ∈ R. Then (x1 + x2 i, x3 + x4 i) = (0, 0) and so
    x1 + x2 i = 0    and    x3 + x4 i = 0
Accordingly x1 = 0, x2 = 0, x3 = 0, x4 = 0 and so B is independent.

5.32. Let V be the vector space of m × n matrices over a field K. Let E_ij ∈ V be the matrix with 1 as the ij-entry and 0 elsewhere. Show that {E_ij} is a basis of V. Thus dim V = mn.
We need to show that {E_ij} generates V and is independent. Let A = (a_ij) be any matrix in V. Then A = Σ a_ij E_ij. Hence {E_ij} generates V.
Now suppose that Σ x_ij E_ij = 0 where the x_ij are scalars. The ij-entry of Σ x_ij E_ij is x_ij, and the ij-entry of 0 is 0. Thus x_ij = 0 for i = 1, ..., m, j = 1, ..., n. Accordingly the matrices E_ij are independent. Thus {E_ij} is a basis of V.
Remark: Viewing a vector in K^n as a 1 × n matrix, we have shown by the above result that the usual basis defined in Example 5.3, page 88, is a basis of K^n and that dim K^n = n.

SUMS AND INTERSECTIONS
5.33.
Suppose U and W are distinct 4-dimensional subspaces of a vector space V of dimension 6. Find the possible dimensions of U ∩ W.
Since U and W are distinct, U + W properly contains U and W; hence dim (U + W) > 4. But dim (U + W) cannot be greater than 6, since dim V = 6. Hence we have two possibilities: (i) dim (U + W) = 5, or (ii) dim (U + W) = 6. Using Theorem 5.8, that dim (U + W) = dim U + dim W - dim (U ∩ W), we obtain
    (i) 5 = 4 + 4 - dim (U ∩ W)    or    dim (U ∩ W) = 3
    (ii) 6 = 4 + 4 - dim (U ∩ W)    or    dim (U ∩ W) = 2
That is, the dimension of U ∩ W must be either 2 or 3.

5.34. Let U and W be the subspaces of R^4 generated by
    {(1, 1, 0, -1), (1, 2, 3, 0), (2, 3, 3, -1)}    and    {(1, 2, 2, -2), (2, 3, 2, -3), (1, 3, 4, -3)}
respectively. Find (i) dim (U + W), (ii) dim (U ∩ W).
(i) U + W is the space spanned by all six vectors. Hence form the matrix whose rows are the given six vectors, and then row reduce to echelon form:

    1 1 0 -1        1 1 0 -1        1 1  0 -1
    1 2 3  0        0 1 3  1        0 1  3  1
    2 3 3 -1   to   0 1 3  1   to   0 0 -1 -2
    1 2 2 -2        0 1 2 -1        0 0  0  0
    2 3 2 -3        0 1 2 -1        0 0  0  0
    1 3 4 -3        0 2 4 -2        0 0  0  0

Since the echelon matrix has three nonzero rows, dim (U + W) = 3.
(ii) First find dim U and dim W. Form the two matrices whose rows are the generators of U and W respectively and then row reduce each to echelon form:

    1 1 0 -1        1 1 0 -1                1 2 2 -2        1  2  2 -2
    1 2 3  0   to   0 1 3  1      and       2 3 2 -3   to   0 -1 -2  1
    2 3 3 -1        0 0 0  0                1 3 4 -3        0  0  0  0

Since each of the echelon matrices has two nonzero rows, dim U = 2 and dim W = 2. Using Theorem 5.8, that dim (U + W) = dim U + dim W - dim (U ∩ W), we have
    3 = 2 + 2 - dim (U ∩ W)    or    dim (U ∩ W) = 1

5.35. Let U be the subspace of R^5 generated by
    {(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)}
and let W be the subspace generated by
    {(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)}
Find a basis and the dimension of (i) U + W, (ii) U ∩ W.
(i) U + W is the space generated by all six vectors.
Hence form the matrix whose rows are the six vectors and then row reduce to echelon form:

    1 3 -2  2 3        1  3 -2  2  3        1 3 -2 2  3
    1 4 -3  4 2        0  1 -1  2 -1        0 1 -1 2 -1
    2 3 -1 -2 9   to   0 -3  3 -6  3   to   0 0  2 0 -2
    1 3  0  2 1        0  0  2  0 -2        0 0  0 0  0
    1 5 -6  6 3        0  2 -4  4  0        0 0  0 0  0
    2 5  3  2 1        0 -1  7 -2 -5        0 0  0 0  0

The set of nonzero rows of the echelon matrix,
    {(1, 3, -2, 2, 3), (0, 1, -1, 2, -1), (0, 0, 2, 0, -2)}
is a basis of U + W; thus dim (U + W) = 3.
(ii) First find homogeneous systems whose solution sets are U and W respectively. Form the matrix whose first rows are the generators of U and whose last row is (x, y, z, s, t), and then row reduce to echelon form; the generators reduce to two nonzero rows and the last row becomes
    (0, 0, -x + y + z, 4x - 2y + s, -6x + y + t)
Set the entries of this row equal to 0 to obtain the homogeneous system whose solution set is U:
    -x + y + z = 0,    4x - 2y + s = 0,    -6x + y + t = 0
Now form the matrix whose first rows are the generators of W and whose last row is (x, y, z, s, t), and row reduce in the same way. Setting the entries of the last row equal to 0 gives the homogeneous system whose solution set is W:
    -9x + 3y + z = 0,    4x - 2y + s = 0,    2x - y + t = 0
Combining both systems, we obtain the homogeneous system whose solution set is U ∩ W, and reduce it to echelon form:

    -x + y + z = 0           -x + y + z = 0            -x + y + z = 0
    4x - 2y + s = 0           2y + 4z + s = 0           2y + 4z + s = 0
    -6x + y + t = 0    or    -5y - 6z + t = 0    or     8z + 5s + 2t = 0
    -9x + 3y + z = 0         -6y - 8z = 0               4z + 3s = 0
    4x - 2y + s = 0           2y + 4z + s = 0           s - 2t = 0
    2x - y + t = 0            y + 2z + t = 0

which reduces finally to
    -x + y + z = 0
    2y + 4z + s = 0
    8z + 5s + 2t = 0
    s - 2t = 0
There is one free variable, which is t; hence dim (U ∩ W) = 1. Setting t = 2, we obtain the solution x = 1, y = 4, z = -3, s = 4, t = 2. Thus {(1, 4, -3, 4, 2)} is a basis of U ∩ W.

COORDINATE VECTORS
5.36. Find the coordinate vector of v relative to the basis {(1, 1, 1), (1, 1, 0), (1, 0, 0)} of R^3 where (i) v = (4, -3, 2), (ii) v = (a, b, c).
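Problem 5.36 asks for coordinates relative to a triangular basis, so the answer can be previewed with a few lines of Python before working it by hand; this is a minimal sketch, and the helper names are my own:

```python
# The basis {(1,1,1), (1,1,0), (1,0,0)} is triangular, so the system
# v = x(1,1,1) + y(1,1,0) + z(1,0,0) solves by back-substitution:
# for v = (a, b, c) we get x = c, y = b - c, z = a - b.

def coords(v):
    a, b, c = v
    return (c, b - c, a - b)

def from_coords(x, y, z):
    # Recombine x(1,1,1) + y(1,1,0) + z(1,0,0).
    return (x + y + z, x + y, x)

print(coords((4, -3, 2)))                # case (i) -> (2, -5, 7)
print(from_coords(*coords((4, -3, 2))))  # recovers (4, -3, 2)
```

Round-tripping through `from_coords` confirms the closed form, which is exactly the formula derived in part (ii) of the solution below is checking.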
In each case set v as a linear combination of the basis vectors using unknown scalars x, y and z:
    v = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
and then solve for the solution vector (x, y, z). (The solution is unique since the basis vectors are linearly independent.)
(i) (4, -3, 2) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x, x, x) + (y, y, 0) + (z, 0, 0) = (x + y + z, x + y, x)
Set corresponding components equal to each other to obtain the system
    x + y + z = 4,    x + y = -3,    x = 2
Substitute x = 2 into the second equation to obtain y = -5; then put x = 2, y = -5 into the first equation to obtain z = 7. Thus x = 2, y = -5, z = 7 is the unique solution to the system and so the coordinate vector of v relative to the given basis is [v] = (2, -5, 7).
(ii) (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x). Then
    x + y + z = a,    x + y = b,    x = c
from which x = c, y = b - c, z = a - b. Thus [v] = (c, b - c, a - b), that is, [(a, b, c)] = (c, b - c, a - b).

5.37. Let V be the vector space of 2 × 2 matrices over R. Find the coordinate vector of the matrix A ∈ V relative to the basis
    {(1 1; 1 1), (0 -1; 1 0), (1 -1; 0 0), (1 0; 0 0)}    where    A = (2 3; 4 -7)
Set A as a linear combination of the matrices in the basis using unknown scalars x, y, z, w:
    A = (2 3; 4 -7) = x(1 1; 1 1) + y(0 -1; 1 0) + z(1 -1; 0 0) + w(1 0; 0 0)
      = (x + z + w,  x - y - z;  x + y,  x)
Set corresponding entries equal to each other to obtain the system
    x + z + w = 2,    x - y - z = 3,    x + y = 4,    x = -7
from which x = -7, y = 11, z = -21, w = 30. Thus [A] = (-7, 11, -21, 30). (Note that the coordinate vector of A must be a vector in R^4 since dim V = 4.)

5.38. Let W be the vector space of 2 × 2 symmetric matrices over R. (See Problem 5.29.)
Find the coordinate vector of the matrix A = (4 -11; -11 -7) relative to the basis
    {(1 -2; -2 1), (2 1; 1 3), (4 -1; -1 -5)}
Set A as a linear combination of the matrices in the basis using unknown scalars x, y and z:
    A = (4 -11; -11 -7) = x(1 -2; -2 1) + y(2 1; 1 3) + z(4 -1; -1 -5)
      = (x + 2y + 4z,  -2x + y - z;  -2x + y - z,  x + 3y - 5z)
Set corresponding entries equal to each other to obtain the equivalent system of linear equations and reduce to echelon form (the two off-diagonal entries give the same equation, which we list once):

    x + 2y + 4z = 4           x + 2y + 4z = 4          x + 2y + 4z = 4
    -2x + y - z = -11   or    5y + 7z = -3       or    5y + 7z = -3
    x + 3y - 5z = -7          y - 9z = -11             52z = 52

We obtain z = 1 from the third equation, then y = -2 from the second equation, and then x = 4 from the first equation. Thus the solution of the system is x = 4, y = -2, z = 1; hence [A] = (4, -2, 1). (Since dim W = 3 by Problem 5.29, the coordinate vector of A must be a vector in R^3.)

5.39. Let {e1, e2, e3} and {f1, f2, f3} be bases of a vector space V (of dimension 3). Suppose
    e1 = a1 f1 + a2 f2 + a3 f3
    e2 = b1 f1 + b2 f2 + b3 f3        (1)
    e3 = c1 f1 + c2 f2 + c3 f3
Let P be the matrix whose rows are the coordinate vectors of e1, e2 and e3 respectively, relative to the basis {f_i}:
    P = (a1 a2 a3; b1 b2 b3; c1 c2 c3)
Show that, for any vector v ∈ V, [v]_e P = [v]_f. That is, multiplying the coordinate vector of v relative to the basis {e_i} by the matrix P, we obtain the coordinate vector of v relative to the basis {f_i}. (The matrix P is frequently called the change of basis matrix.)
Suppose v = r e1 + s e2 + t e3; then [v]_e = (r, s, t).
Using (1), we have
    v = r(a1 f1 + a2 f2 + a3 f3) + s(b1 f1 + b2 f2 + b3 f3) + t(c1 f1 + c2 f2 + c3 f3)
      = (r a1 + s b1 + t c1) f1 + (r a2 + s b2 + t c2) f2 + (r a3 + s b3 + t c3) f3
Hence
    [v]_f = (r a1 + s b1 + t c1,  r a2 + s b2 + t c2,  r a3 + s b3 + t c3)
On the other hand,
    [v]_e P = (r, s, t)(a1 a2 a3; b1 b2 b3; c1 c2 c3) = (r a1 + s b1 + t c1,  r a2 + s b2 + t c2,  r a3 + s b3 + t c3)
Accordingly, [v]_e P = [v]_f.
Remark: In Chapter 8 we shall write coordinate vectors as column vectors rather than row vectors. Then, by the above,
    Q [v]_e = (a1 b1 c1; a2 b2 c2; a3 b3 c3)(r; s; t) = (r a1 + s b1 + t c1; r a2 + s b2 + t c2; r a3 + s b3 + t c3) = [v]_f
where Q is the matrix whose columns are the coordinate vectors of e1, e2 and e3 respectively, relative to the basis {f_i}. Note that Q is the transpose of P and that Q appears on the left of the column vector [v]_e whereas P appears on the right of the row vector [v]_e.

RANK OF A MATRIX
5.40. Find the rank of the matrix A where:
    (i) A = (1 3 1 -2 -3; 1 4 3 -1 -4; 2 3 -4 -7 -3; 3 8 1 -7 -8),    (ii) A = ...,    (iii) A = ...
(i) Row reduce to echelon form:

    1 3  1 -2 -3        1  3  1 -2 -3        1 3 1 -2 -3
    1 4  3 -1 -4   to   0  1  2  1 -1   to   0 1 2  1 -1
    2 3 -4 -7 -3        0 -3 -6 -3  3        0 0 0  0  0
    3 8  1 -7 -8        0 -1 -2 -1  1        0 0 0  0  0

Since the echelon matrix has two nonzero rows, rank (A) = 2.
(ii) Since row rank equals column rank, it is easier to form the transpose of A and then row reduce to echelon form. The resulting echelon matrix has three nonzero rows; thus rank (A) = 3.
(iii) The two columns are linearly independent since one is not a multiple of the other. Hence rank (A) = 2.

5.41. Let A and B be arbitrary matrices for which the product AB is defined. Show that rank (AB) ≤ rank (B) and rank (AB) ≤ rank (A).
By Problem 4.33, page 80, the row space of AB is contained in the row space of B; hence rank (AB) ≤ rank (B). Furthermore, by Problem 4.71, page 84, the column space of AB is contained in the column space of A; hence rank (AB) ≤ rank (A).

5.42. Let A be an n-square matrix. Show that A is invertible if and only if rank (A) = n.
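The rank computations above, and the criterion of Problem 5.42, can be checked on concrete matrices. A minimal sketch; the exact-arithmetic rank routine is my own helper, not notation from the text:

```python
from fractions import Fraction

def rank(rows):
    """Row rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Problem 5.40(i): this 4 x 5 matrix reduces to two nonzero rows.
A = [(1, 3, 1, -2, -3), (1, 4, 3, -1, -4), (2, 3, -4, -7, -3), (3, 8, 1, -7, -8)]

# Problem 5.42 in action: an invertible 3-square matrix has rank 3,
# while a matrix whose second row is twice the first does not.
B = [(2, 1, 0), (1, -1, 2), (0, 3, 1)]
C = [(1, 2, 3), (2, 4, 6), (0, 1, 1)]
print(rank(A), rank(B), rank(C))   # 2 3 2
```

Using `Fraction` rather than floats keeps the elimination exact, so a pivot is zero precisely when the underlying rational entry is zero.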
Note that the rows of the n-square identity matrix I_n are linearly independent since I_n is in echelon form; hence rank (I_n) = n. Now if A is invertible then, by Problem 3.36, page 57, A is row equivalent to I_n; hence rank (A) = n. But if A is not invertible then A is row equivalent to a matrix with a zero row; hence rank (A) < n. That is, A is invertible if and only if rank (A) = n.

5.43. Let x_{i1}, x_{i2}, ..., x_{ik} be the free variables of a homogeneous system of linear equations with n unknowns. Let v_j be the solution for which x_{ij} = 1 and all other free variables = 0. Show that the solutions v_1, v_2, ..., v_k are linearly independent.
Let A be the matrix whose rows are the v_j respectively. We interchange column 1 and column i_1, then column 2 and column i_2, ..., and then column k and column i_k, and obtain the k × n matrix

    B = (1 0 ... 0 c_{1,k+1} ... c_{1n};
         0 1 ... 0 c_{2,k+1} ... c_{2n};
         .............................;
         0 0 ... 1 c_{k,k+1} ... c_{kn})  =  (I, C)

The above matrix B is in echelon form and so its rows are independent; hence rank (B) = k. Since A and B are column equivalent, they have the same rank, i.e. rank (A) = k. But A has k rows; hence these rows, i.e. the v_j, are linearly independent as claimed.

MISCELLANEOUS PROBLEMS
5.44. The concept of linear dependence is extended to every set of vectors, finite or infinite, as follows: the set of vectors A = {v_i} is linearly dependent iff there exist vectors v_{i1}, ..., v_{in} ∈ A and scalars a_1, ..., a_n ∈ K, not all of them 0, such that
    a_1 v_{i1} + a_2 v_{i2} + ... + a_n v_{in} = 0
Otherwise A is said to be linearly independent. Suppose that A_1, A_2, ... are linearly independent sets of vectors, and that A_1 ⊂ A_2 ⊂ .... Show that the union A = A_1 ∪ A_2 ∪ ... is also linearly independent.
Suppose A is linearly dependent. Then there exist vectors v_1, ..., v_n ∈ A and scalars a_1, ..., a_n ∈ K, not all of them 0, such that
    a_1 v_1 + a_2 v_2 + ... + a_n v_n = 0    (1)
Since A = ∪ A_i and the v_j ∈ A, there exist sets A_{i1}, . .
., A_{in} such that
    v_1 ∈ A_{i1},    v_2 ∈ A_{i2},    ...,    v_n ∈ A_{in}
Let k be the maximum index of the sets A_{ij}: k = max (i_1, ..., i_n). It follows then, since A_1 ⊂ A_2 ⊂ ..., that each A_{ij} is contained in A_k. Hence v_1, v_2, ..., v_n ∈ A_k and so, by (1), A_k is linearly dependent, which contradicts our hypothesis. Thus A is linearly independent.

5.45. Consider a finite sequence of vectors S = {v_1, v_2, ..., v_n}. Let T be the sequence of vectors obtained from S by one of the following "elementary operations": (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that S and T generate the same space W. Also show that T is independent if and only if S is independent.
Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On the other hand, each operation has an inverse of the same type (Prove!); hence the vectors in S are linear combinations of vectors in T. Thus S and T generate the same space W. Also, T is independent if and only if dim W = n, and this is true iff S is also independent.

5.46. Let A = (a_ij) and B = (b_ij) be row equivalent m × n matrices over a field K, and let v_1, ..., v_n be any vectors in a vector space V over K. Let
    u_1 = a_11 v_1 + a_12 v_2 + ... + a_1n v_n        w_1 = b_11 v_1 + b_12 v_2 + ... + b_1n v_n
    u_2 = a_21 v_1 + a_22 v_2 + ... + a_2n v_n        w_2 = b_21 v_1 + b_22 v_2 + ... + b_2n v_n
    ......................................            ......................................
    u_m = a_m1 v_1 + a_m2 v_2 + ... + a_mn v_n        w_m = b_m1 v_1 + b_m2 v_2 + ... + b_mn v_n
Show that {u_i} and {w_i} generate the same space.
Applying an "elementary operation" of the preceding problem to {u_i} is equivalent to applying an elementary row operation to the matrix A. Since A and B are row equivalent, B can be obtained from A by a sequence of elementary row operations; hence {w_i} can be obtained from {u_i} by the corresponding sequence of operations. Accordingly, {u_i} and {w_i} generate the same space.

5.47. Let v_1, ..., v_n belong to a vector space V over a field K.
Let
    w_1 = a_11 v_1 + a_12 v_2 + ... + a_1n v_n
    w_2 = a_21 v_1 + a_22 v_2 + ... + a_2n v_n
    ......................................
    w_n = a_n1 v_1 + a_n2 v_2 + ... + a_nn v_n
where a_ij ∈ K. Let P be the n-square matrix of coefficients, i.e. let P = (a_ij).
(i) Suppose P is invertible. Show that {w_i} and {v_i} generate the same space; hence {w_i} is independent if and only if {v_i} is independent.
(ii) Suppose P is not invertible. Show that {w_i} is dependent.
(iii) Suppose {w_i} is independent. Show that P is invertible.
(i) Since P is invertible, it is row equivalent to the identity matrix I. Hence by the preceding problem {w_i} and {v_i} generate the same space. Thus one is independent if and only if the other is.
(ii) Since P is not invertible, it is row equivalent to a matrix with a zero row. This means that {w_i} generates a space which has a generating set of less than n elements. Thus {w_i} is dependent.
(iii) This is the contrapositive of the statement of (ii), and so it follows from (ii).

5.48. Suppose V is the direct sum of its subspaces U and W, i.e. V = U ⊕ W. Show that: (i) if {u_1, ..., u_m} ⊂ U and {w_1, ..., w_n} ⊂ W are independent, then {u_i, w_j} is also independent; (ii) dim V = dim U + dim W.
(i) Suppose a_1 u_1 + ... + a_m u_m + b_1 w_1 + ... + b_n w_n = 0, where a_i, b_j are scalars. Then
    0 = (a_1 u_1 + ... + a_m u_m) + (b_1 w_1 + ... + b_n w_n) = 0 + 0
where 0, a_1 u_1 + ... + a_m u_m ∈ U and 0, b_1 w_1 + ... + b_n w_n ∈ W. Since such a sum for 0 is unique, this leads to
    a_1 u_1 + ... + a_m u_m = 0,    b_1 w_1 + ... + b_n w_n = 0
The independence of the u_i implies that the a_i are all 0, and the independence of the w_j implies that the b_j are all 0. Consequently, {u_i, w_j} is independent.
(ii) Method 1. Since V = U ⊕ W, we have V = U + W and U ∩ W = {0}. Thus, by Theorem 5.8, page 90,
    dim V = dim U + dim W - dim (U ∩ W) = dim U + dim W - 0 = dim U + dim W
Method 2. Suppose {u_1, ..., u_m} and {w_1, ..., w_n} are bases of U and W respectively. Since they generate U and W respectively, {u_i, w_j} generates V = U + W.
On the other hand, by (i), {u_i, w_j} is independent. Thus {u_i, w_j} is a basis of V; hence dim V = dim U + dim W.

5.49. Let U be a subspace of a vector space V of finite dimension. Show that there exists a subspace W of V such that V = U ⊕ W.
Let {u_1, ..., u_r} be a basis of U. Since {u_i} is linearly independent, it can be extended to a basis of V, say, {u_1, ..., u_r, w_1, ..., w_s}. Let W be the space generated by {w_1, ..., w_s}. Since {u_i, w_j} generates V, V = U + W. On the other hand, U ∩ W = {0} (Problem 5.62). Accordingly, V = U ⊕ W.

5.50. Recall (page 65) that if K is a subfield of a field E (or: E is an extension of K), then E may be viewed as a vector space over K. (i) Show that the complex field C is a vector space of dimension 2 over the real field R. (ii) Show that the real field R is a vector space of infinite dimension over the rational field Q.
(i) We claim that {1, i} is a basis of C over R. For if v ∈ C, then v = a + bi = a·1 + b·i where a, b ∈ R; that is, {1, i} generates C over R. Furthermore, if x·1 + y·i = 0 or x + yi = 0, where x, y ∈ R, then x = 0 and y = 0; that is, {1, i} is linearly independent over R. Thus {1, i} is a basis of C over R, and so C is of dimension 2 over R.
(ii) We claim that, for any n, {1, π, π^2, ..., π^n} is linearly independent over Q. For suppose a_0·1 + a_1 π + a_2 π^2 + ... + a_n π^n = 0, where the a_i ∈ Q, and not all the a_i are 0. Then π is a root of the following nonzero polynomial over Q: a_0 + a_1 x + a_2 x^2 + ... + a_n x^n. But it can be shown that π is a transcendental number, i.e. that π is not a root of any nonzero polynomial over Q. Accordingly, the n + 1 real numbers 1, π, π^2, ..., π^n are linearly independent over Q. Thus for any finite n, R cannot be of dimension n over Q, i.e. R is of infinite dimension over Q.

5.51. Let K be a subfield of a field L and L a subfield of a field E: K ⊆ L ⊆ E. (Hence K is a subfield of E.) Suppose that E is of dimension n over L and L is of dimension m over K.
Show that E is of dimension mn over K.
Suppose {v_1, ..., v_n} is a basis of E over L and {a_1, ..., a_m} is a basis of L over K. We claim that
    {a_i v_j : i = 1, ..., m, j = 1, ..., n}
is a basis of E over K. Note that {a_i v_j} contains mn elements.
Let w be any arbitrary element in E. Since {v_1, ..., v_n} generates E over L, w is a linear combination of the v_j with coefficients in L:
    w = b_1 v_1 + b_2 v_2 + ... + b_n v_n,    b_i ∈ L    (1)
Since {a_1, ..., a_m} generates L over K, each b_i ∈ L is a linear combination of the a_j with coefficients in K:
    b_1 = k_11 a_1 + k_12 a_2 + ... + k_1m a_m
    .....................................
    b_n = k_n1 a_1 + k_n2 a_2 + ... + k_nm a_m
where k_ij ∈ K. Substituting in (1), we obtain
    w = (k_11 a_1 + ... + k_1m a_m) v_1 + ... + (k_n1 a_1 + ... + k_nm a_m) v_n
      = k_11 a_1 v_1 + ... + k_1m a_m v_1 + ... + k_n1 a_1 v_n + ... + k_nm a_m v_n
Thus w is a linear combination of the a_i v_j with coefficients in K; hence {a_i v_j} generates E over K.
The proof is complete if we show that {a_i v_j} is linearly independent over K. Suppose, for scalars x_ji ∈ K, Σ x_ji (a_i v_j) = 0; that is,
    (x_11 a_1 + x_12 a_2 + ... + x_1m a_m) v_1 + ... + (x_n1 a_1 + x_n2 a_2 + ... + x_nm a_m) v_n = 0
Since {v_1, ..., v_n} is linearly independent over L and since the above coefficients of the v_j belong to L, each coefficient must be 0:
    x_11 a_1 + ... + x_1m a_m = 0,    ...,    x_n1 a_1 + ... + x_nm a_m = 0
But {a_1, ..., a_m} is linearly independent over K; hence since the x_ji ∈ K,
    x_11 = 0, ..., x_1m = 0,    ...,    x_n1 = 0, ..., x_nm = 0
Accordingly, {a_i v_j} is linearly independent over K and the theorem is proved.

Supplementary Problems

LINEAR DEPENDENCE
5.52.
Determine whether u and v are linearly dependent where:
(i) u = (1, 2, 3, 4), v = (4, 3, 2, 1)
(ii) u = (-1, 6, -12), v = (1/2, -3, 6)
(iii) u = (0, 1), v = (0, -3)
(iv) u = (1, 0, 0), v = (0, 0, -3)
(vii) u = -t^3 + (1/2)t^2 - 16, v = (1/2)t^3 - (1/4)t^2 + 8
(viii) u = t^2 + 3t + 4, v = t^2 + 4t + 3

5.53. Determine whether the following vectors in R^4 are linearly dependent or independent:
(i) (1, 3, -1, 4), (3, 8, -5, 7), (2, 9, 4, 23); (ii) (1, -2, 4, 1), (2, 1, 0, -3), (3, -6, 1, 4).

5.54. Let V be the vector space of 2 × 3 matrices over R. Determine whether the matrices A, B, C ∈ V are linearly dependent or independent where:
(i) A = (1 -2 3; ...), B = (1 -1 4; ...), C = (3 -8 7; 2 10 -1)
(ii) A = ..., B = ..., C = ...

5.55. Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether u, v, w ∈ V are linearly dependent or independent where:
(i) u = t^3 - 4t^2 + 2t + 3, v = t^3 + 2t^2 + 4t - 1, w = 2t^3 - t^2 - 3t + 5
(ii) u = t^3 - 5t^2 - 2t + 3, v = t^3 - 4t^2 - 3t + 4, w = 2t^3 - 7t^2 - 7t + 9

5.56. Let V be the vector space of functions from R into R. Show that f, g, h ∈ V are linearly independent where: (i) f(t) = e^t, g(t) = sin t, h(t) = t^2; (ii) f(t) = e^t, g(t) = e^(2t), h(t) = t; (iii) f(t) = e^t, g(t) = sin t, h(t) = cos t.

5.57. Show that: (i) the vectors (1 - i, i) and (2, -1 + i) in C^2 are linearly dependent over the complex field C but are linearly independent over the real field R; (ii) the vectors (3 + √2, 1 + √2) and (7, 1 + 2√2) in R^2 are linearly dependent over the real field R but are linearly independent over the rational field Q.

5.58. Suppose u, v and w are linearly independent vectors. Show that:
(i) u + v - 2w, u - v - w and u + w are linearly independent;
(ii) u + v - 3w, u + 3v - w and v + w are linearly dependent.

5.59. Prove or show a counterexample: If the nonzero vectors u, v and w are linearly dependent, then w is a linear combination of u and v.

5.60. Suppose v_1, v_2, ..., v_n are linearly independent vectors.
Prove the following:
(i) {a_1 v_1, a_2 v_2, ..., a_n v_n} is linearly independent where each a_i ≠ 0.
(ii) {v_1, ..., v_{i-1}, w, v_{i+1}, ..., v_n} is linearly independent where w = b_1 v_1 + ... + b_i v_i + ... + b_n v_n and b_i ≠ 0.

5.61. Let v = (a, b) and w = (c, d) belong to K^2. Show that {v, w} is linearly dependent if and only if ad - bc = 0.

5.62. Suppose {u_1, ..., u_r, w_1, ..., w_s} is a linearly independent subset of a vector space V. Show that L(u_i) ∩ L(w_j) = {0}. (Recall that L(u_i) is the linear span, i.e. the space generated by the u_i.)

5.63. Suppose (a_11, ..., a_1n), ..., (a_m1, ..., a_mn) are linearly independent vectors in K^n, and suppose v_1, ..., v_n are linearly independent vectors in a vector space V over K. Show that the vectors
    w_1 = a_11 v_1 + ... + a_1n v_n,    ...,    w_m = a_m1 v_1 + ... + a_mn v_n
are also linearly independent.

BASIS AND DIMENSION
5.64. Determine whether or not each of the following forms a basis of R^2:
(i) (1, 1) and (3, 1); (ii) (2, 1), (1, -1) and (0, 2); (iii) (0, 1) and (0, -3); (iv) (2, 1) and (-3, 87).

5.65. Determine whether or not each of the following forms a basis of R^3:
(i) (1, 2, -1) and (0, 3, 1)
(ii) (2, 4, -3), (0, 1, 1) and (0, 1, -1)
(iii) (1, 5, -6), (2, 1, 8), (3, -1, 4) and (2, 1, 1)
(iv) (1, 3, -4), (1, 4, -3) and (2, 3, -11)

5.66. Find a basis and the dimension of the subspace W of R^4 generated by:
(i) (1, 4, -1, 3), (2, 1, -3, -1) and (0, 2, 1, -5)
(ii) (1, -4, -2, 1), (1, -3, -1, 2) and (3, -8, -2, 7)

5.67. Let V be the space of 2 × 2 matrices over R and let W be the subspace generated by the matrices
    ( ... ), ( ... ), ( ... ), ( ... )
Find a basis and the dimension of W.

5.68. Let W be the space generated by the polynomials
    u = t^3 + 2t^2 - 2t + 1,    v = t^3 + 3t^2 - t + 4    and    w = 2t^3 + t^2 - 7t - 7
Find a basis and the dimension of W.

5.69.
Find a basis and the dimension of the solution space W of each homogeneous system:

    (i)   x + 3y + 2z = 0        (ii)   x - 2y + 7z = 0        (iii)   x + 4y + 2z = 0
          x + 5y +  z = 0              2x + 3y - 2z = 0               2x +  y + 5z = 0
         3x + 5y + 8z = 0              2x -  y +  z = 0

5.70. Find a basis and the dimension of the solution space W of each homogeneous system:

    (i)   x + 2y - 2z + 2s -  t = 0        (ii)   x + 2y -  z + 3s - 4t = 0
          x + 2y -  z + 3s - 2t = 0              2x + 4y - 2z -  s + 5t = 0
         2x + 4y - 7z +  s +  t = 0              2x + 4y - 2z + 4s - 2t = 0

5.71. Find a homogeneous system whose solution set W is generated by
    {(1, -2, 0, 3, -1), (2, -3, 2, 5, -3), (1, -2, 1, 2, -2)}

5.72. Let V and W be the following subspaces of R^4:
    V = {(a, b, c, d) : b - 2c + d = 0},    W = {(a, b, c, d) : a = d, b = 2c}
Find a basis and the dimension of (i) V, (ii) W, (iii) V ∩ W.

5.73. Let V be the vector space of polynomials in t of degree ≤ n. Determine whether or not each of the following is a basis of V:
(i) {1, 1 + t, 1 + t + t^2, 1 + t + t^2 + t^3, ..., 1 + t + t^2 + ... + t^(n-1) + t^n}
(ii) {1 + t, t + t^2, t^2 + t^3, ..., t^(n-2) + t^(n-1), t^(n-1) + t^n}

SUMS AND INTERSECTIONS
5.74. Suppose V and W are 2-dimensional subspaces of R^3. Show that V ∩ W ≠ {0}.

5.75. Suppose U and W are subspaces of V and that dim U = 4, dim W = 5 and dim V = 7. Find the possible dimensions of U ∩ W.

5.76. Let U and W be subspaces of R^3 for which dim U = 1, dim W = 2 and U ⊄ W. Show that R^3 = U ⊕ W.

5.77. Let U be the subspace of R^5 generated by
    {(1, 3, -3, -1, -4), (1, 4, -1, -2, -2), (2, 9, 0, -5, -2)}
and let W be the subspace generated by
    {(1, 6, 2, -2, 3), (2, 8, -1, -6, -5), (1, 3, -1, -5, -6)}
Find (i) dim (U + W), (ii) dim (U ∩ W).

5.78. Let V be the vector space of polynomials over R. Let U and W be the subspaces generated by
    {t^3 + 4t^2 - t + 3, t^3 + 5t^2 + 5, 3t^3 + 10t^2 - 5t + 5}
and
    {t^3 + 4t^2 + 6, t^3 + 2t^2 - t + 5, 2t^3 + 2t^2 - 3t + 9}
respectively. Find (i) dim (U + W), (ii) dim (U ∩ W).

5.79.
Let U be the subspace of R^5 generated by
    {(1, -1, -1, -2, 0), (1, -2, -2, 0, -3), (1, -1, -2, -2, 1)}
and let W be the subspace generated by
    {(1, -2, -3, 0, -2), (1, -1, -3, 2, -4), (1, -1, -2, 2, -5)}
(i) Find two homogeneous systems whose solution spaces are U and W, respectively. (ii) Find a basis and the dimension of U ∩ W.

COORDINATE VECTORS
5.80. Consider the following basis of R^2: {(2, 1), (1, -1)}. Find the coordinate vector of v ∈ R^2 relative to the above basis where: (i) v = (2, 3); (ii) v = (4, -1); (iii) v = (3, -3); (iv) v = (a, b).

5.81. In the vector space V of polynomials in t of degree ≤ 3, consider the following basis: {1, 1 - t, (1 - t)^2, (1 - t)^3}. Find the coordinate vector of v ∈ V relative to the above basis if: (i) v = 2 - 3t + t^2 + 2t^3; (ii) v = 3 - 2t - t^2; (iii) v = a + bt + ct^2 + dt^3.

5.82. In the vector space W of 2 × 2 symmetric matrices over R, consider the following basis:
    { ( ... ), ( ... ), ( ... ) }
Find the coordinate vector of the matrix A ∈ W relative to the above basis if: (i) A = ..., (ii) A = ...

5.83. Consider the following two bases of R^3:
    {e1 = (1, 1, 1), e2 = (0, 2, 3), e3 = (0, 2, -1)}    and    {f1 = (1, 1, 0), f2 = (1, -1, 0), f3 = (0, 0, 1)}
(i) Find the coordinate vector of v = (3, 5, -2) relative to each basis: [v]_e and [v]_f.
(ii) Find the matrix P whose rows are respectively the coordinate vectors of the e_i relative to the basis {f1, f2, f3}.
(iii) Verify that [v]_e P = [v]_f.

5.84. Suppose {e1, ..., en} and {f1, ..., fn} are bases of a vector space V (of dimension n). Let P be the matrix whose rows are respectively the coordinate vectors of the e's relative to the basis {f_i}. Prove that for any vector v ∈ V, [v]_e P = [v]_f. (This result is proved in Problem 5.39 in the case n = 3.)

5.85. Show that the coordinate vector of 0 ∈ V relative to any basis of V is always the zero n-tuple (0, 0, ..., 0).

RANK OF A MATRIX
5.86. Find the rank of each matrix: (i) ..., (ii) ..., (iii) ..., (iv) ...

5.87. Let A and B be arbitrary m × n matrices.
Show that rank (A + B) ≤ rank (A) + rank (B).

5.88. Give examples of 2 × 2 matrices A and B such that:
(i) rank (A + B) < rank (A), rank (B)
(ii) rank (A + B) = rank (A) = rank (B)
(iii) rank (A + B) > rank (A), rank (B)

MISCELLANEOUS PROBLEMS
5.89. Let W be the vector space of 3 × 3 symmetric matrices over K. Show that dim W = 6 by exhibiting a basis of W. (Recall that A = (a_ij) is symmetric iff a_ij = a_ji.)

5.90. Let W be the vector space of 3 × 3 antisymmetric matrices over K. Show that dim W = 3 by exhibiting a basis of W. (Recall that A = (a_ij) is antisymmetric iff a_ij = -a_ji.)

5.91. Suppose dim V = n. Show that a generating set with n elements is a basis. (Compare with Theorem 5.6(iii), page 89.)

5.92. Let t_1, t_2, ..., t_n be symbols, and let K be any field. Let V be the set of expressions
    a_1 t_1 + a_2 t_2 + ... + a_n t_n    where    a_i ∈ K
Define addition in V by
    (a_1 t_1 + ... + a_n t_n) + (b_1 t_1 + ... + b_n t_n) = (a_1 + b_1) t_1 + (a_2 + b_2) t_2 + ... + (a_n + b_n) t_n
Define scalar multiplication on V by
    k(a_1 t_1 + a_2 t_2 + ... + a_n t_n) = k a_1 t_1 + k a_2 t_2 + ... + k a_n t_n
Show that V is a vector space over K with the above operations. Also show that {t_1, ..., t_n} is a basis of V where, for i = 1, ..., n,
    t_i = 0 t_1 + ... + 0 t_{i-1} + 1 t_i + 0 t_{i+1} + ... + 0 t_n

5.93. Let V be a vector space of dimension n over a field K, and let K be a vector space of dimension m over a subfield F. (Hence V may also be viewed as a vector space over the subfield F.) Prove that the dimension of V over F is mn.

5.94. Let U and W be vector spaces over the same field K, and let V be the external direct sum of U and W (see Problem 4.45). Let Û and Ŵ be the subspaces of V defined by Û = {(u, 0) : u ∈ U} and Ŵ = {(0, w) : w ∈ W}.
(i) Show that U is isomorphic to Û under the correspondence u ↔ (u, 0), and that W is isomorphic to Ŵ under the correspondence w ↔ (0, w).
(ii) Show that dim V = dim U + dim W.

5.95. Suppose V = U ⊕ W.
Let V̂ be the external direct sum of U and W. Show that V is isomorphic to V̂ under the correspondence v = u + w ↔ (u, w).

Answers to Supplementary Problems

5.52. (i) no, (ii) yes, (iii) yes, (iv) no, (v) yes, (vi) no, (vii) yes, (viii) no.
5.53. (i) dependent, (ii) independent.
5.54. (i) dependent, (ii) independent.
5.55. (i) independent, (ii) dependent.
5.57. (i) (2, -1 + i) = (1 + i)(1 - i, i); (ii) (7, 1 + 2√2) = (3 - √2)(3 + √2, 1 + √2).
5.59. The statement is false. Counterexample: u = (1, 0), v = (2, 0) and w = (1, 1) in R^2. Lemma 5.2 requires only that one of the nonzero vectors u, v, w be a linear combination of the preceding ones. In this case, v = 2u.
5.64. (i) yes, (ii) no, (iii) no, (iv) yes.
5.65. (i) no, (ii) yes, (iii) no, (iv) no.
5.66. (i) dim W = 3, (ii) dim W = 2.
5.67. dim W = 2.
5.68. dim W = 2.
5.69. (i) basis, {(7, -1, -2)}; dim W = 1. (ii) dim W = 0. (iii) basis, {(18, -1, -7)}; dim W = 1.
5.70. (i) basis, {(2, -1, 0, 0, 0), (4, 0, 1, -1, 0), (3, 0, 1, 0, 1)}; dim W = 3. (ii) basis, {(2, -1, 0, 0, 0), (1, 0, 1, 0, 0)}; dim W = 2.
5.71. 5x + y - z - s = 0, x + y - z - t = 0.
5.72. (i) basis, {(1, 0, 0, 0), (0, 2, 1, 0), (0, -1, 0, 1)}; dim V = 3. (ii) basis, {(1, 0, 0, 1), (0, 2, 1, 0)}; dim W = 2. (iii) basis, {(0, 2, 1, 0)}; dim (V ∩ W) = 1. Hint: V ∩ W must satisfy all three conditions on a, b, c and d.
5.73. (i) yes, (ii) no. For dim V = n + 1, but the set contains only n elements.
5.75. dim (U ∩ W) = 2, 3 or 4.
5.77. dim (U + W) = 3, dim (U ∩ W) = 2.
5.78. dim (U + W) = 3, dim (U ∩ W) = 1.
5.79. (i) For U: 4x + 2y + s = 0, 3x + 4y - z - t = 0. For W: 4x + 2y - s = 0, 9x + 2y + z + t = 0. (ii) basis, {(1, -2, -5, 0, 0), (0, 0, 1, 0, -1)}; dim (U ∩ W) = 2.
5.80. (i) [v] = (5/3, -4/3), (ii) [v] = (1, 2), (iii) [v] = (0, 3), (iv) [v] = ((a + b)/3, (a - 2b)/3).
5.81. (i) [v] = (2, -5, 7, -2), (ii) [v] = (0, 4, -1, 0), (iii) [v] = (a + b + c + d, -b - 2c - 3d, c + 3d, -d).
5.82. (i) [A] = (2, -1, 1), (ii) [A] = (3, 1, -2).
5.83. (i) [v]_e = (3, -1, 2), [v]_f = (4, -1, -2); (ii) P = (1 0 1; 1 -1 3; 1 -1 -1).
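The answer to 5.83 can be verified directly in a few lines; this is an illustrative sketch, with the closed form for f-coordinates derived from f1 = (1, 1, 0), f2 = (1, -1, 0), f3 = (0, 0, 1):

```python
from fractions import Fraction

E = [(1, 1, 1), (0, 2, 3), (0, 2, -1)]   # the e-basis of Problem 5.83

def f_coords(v):
    """Coordinates in the f-basis: x f1 + y f2 + z f3 = (x+y, x-y, z),
    so for v = (a, b, c) we get x = (a+b)/2, y = (a-b)/2, z = c."""
    a, b, c = (Fraction(t) for t in v)
    return [(a + b) / 2, (a - b) / 2, c]

# Rows of P are the f-coordinates of e1, e2, e3 (Problem 5.39's construction).
P = [f_coords(e) for e in E]

v_e = [3, -1, 2]                         # [v]_e for v = (3, 5, -2)
v = tuple(sum(r * e[i] for r, e in zip(v_e, E)) for i in range(3))
v_eP = [sum(v_e[i] * P[i][j] for i in range(3)) for j in range(3)]
print(v, P, v_eP == f_coords(v))
```

Here `v_eP` comes out as (4, -1, -2), matching [v]_f, which confirms the identity [v]_e P = [v]_f for this example.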
5.86. (i) 3, (ii) 2, (iii) 3, (iv) 2.

5.88. (i) A (ii) A (J P' (o o)' B CI -I) c :) fl o\ /o 1 o\ /o l\

5.89. <;io 0,1 0,0 \0 0/ \o 0/ \l 0/ <""^=a i> "h: °) 0\ /O 0' 1 0,0 1 |, ,0 0/ \0 1 0;

5.90. l\ /O 0^ 0,0 1 -1 0/ \o -1 o;

5.93. Hint. The proof is identical to that given in Problem 5.48, page 113, for a special case (when V is an extension field of K).

Chapter 6

Linear Mappings

MAPPINGS

Let A and B be arbitrary sets. Suppose to each a ∈ A there is assigned a unique element of B; the collection, f, of such assignments is called a function or mapping (or map) from A into B, and is written

f : A → B

We write f(a), read "f of a", for the element of B that f assigns to a ∈ A; it is called the value of f at a or the image of a under f. If A' is any subset of A, then f(A') denotes the set of images of elements of A'; and if B' is any subset of B, then f⁻¹(B') denotes the set of elements of A each of whose image lies in B':

f(A') = {f(a) : a ∈ A'}  and  f⁻¹(B') = {a ∈ A : f(a) ∈ B'}

We call f(A') the image of A' and f⁻¹(B') the inverse image or preimage of B'. In particular, the set of all images, i.e. f(A), is called the image (or: range) of f. Furthermore, A is called the domain of the mapping f : A → B, and B is called its co-domain.

To each mapping f : A → B there corresponds the subset of A × B given by {(a, f(a)) : a ∈ A}. We call this set the graph of f. Two mappings f : A → B and g : A → B are defined to be equal, written f = g, if f(a) = g(a) for every a ∈ A, that is, if they have the same graph. Thus we do not distinguish between a function and its graph. The negation of f = g is written f ≠ g and is the statement: there exists an a ∈ A for which f(a) ≠ g(a).

Example 6.1: Let A = {a, b, c, d} and B = {x, y, z, w}. The following diagram defines a mapping f from A into B. Here f(a) = y, f(b) = x, f(c) = z, and f(d) = y. Also,

f({a, b, d}) = {f(a), f(b), f(d)} = {y, x, y} = {x, y}

The image (or: range) of f is the set {x, y, z}: f(A) = {x, y, z}.
Example 6.2: Let f : R → R be the mapping which assigns to each real number x its square x²:

x ↦ x²  or  f(x) = x²

Here the image of -3 is 9, so we may write f(-3) = 9.

122 LINEAR MAPPINGS [CHAP. 6

We use the arrow ↦ to denote the image of an arbitrary element x ∈ A under a mapping f : A → B by writing x ↦ f(x), as illustrated in the preceding example.

Example 6.3: Consider the 2 × 3 matrix

A = | 1 -3  5 |
    | 2  4 -1 |

If we write the vectors in R³ and R² as column vectors, then A determines the mapping T : R³ → R² defined by v ↦ Av, that is, T(v) = Av, v ∈ R³. Thus if v = (3, 1, -2)ᵀ, then T(v) = Av = (-10, 12)ᵀ.

Remark: Every m × n matrix A over a field K determines the mapping T : Kⁿ → Kᵐ defined by v ↦ Av, where the vectors in Kⁿ and Kᵐ are written as column vectors. For convenience we shall usually denote the above mapping by A, the same symbol used for the matrix.

Example 6.4: Let V be the vector space of polynomials in the variable t over the real field R. Then the derivative defines a mapping D : V → V where, for any polynomial f ∈ V, we let D(f) = df/dt. For example, D(3t² - 5t + 2) = 6t - 5.

Example 6.5: Let V be the vector space of polynomials in t over R (as in the preceding example). Then the integral from, say, 0 to 1 defines a mapping ∫ : V → R where, for any polynomial f ∈ V, we let ∫(f) = ∫₀¹ f(t) dt. For example,

∫(3t² - 5t + 2) = ∫₀¹ (3t² - 5t + 2) dt = 1/2

Note that this map is from the vector space V into the scalar field R, whereas the map in the preceding example is from V into itself.

Example 6.6: Consider two mappings f : A → B and g : B → C. Let a ∈ A; then f(a) ∈ B, the domain of g. Hence we can obtain the image of f(a) under the mapping g, that is, g(f(a)). This map a ↦ g(f(a)) from A into C is called the composition or product of f and g, and is denoted by g∘f.
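The composition just introduced is easy to realize in code; the following sketch (ours, using the functions of Problem 6.6 later in this chapter) builds g∘f generically.

```python
def compose(g, f):
    """Return the composition g∘f, i.e. the map a -> g(f(a))."""
    return lambda a: g(f(a))

f = lambda x: 2 * x + 1      # f : R -> R
g = lambda x: x ** 2 - 2     # g : R -> R

g_of_f = compose(g, f)
print(g_of_f(4))             # g(f(4)) = g(9) = 79
print(compose(f, g)(4))      # f(g(4)) = f(14) = 29, so f∘g ≠ g∘f
```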
In other words, (g∘f) : A → C is the mapping defined by

(g∘f)(a) = g(f(a))

Our first theorem tells us that composition of mappings satisfies the associative law.

Theorem 6.1: Let f : A → B, g : B → C and h : C → D. Then h∘(g∘f) = (h∘g)∘f.

We prove this theorem now. If a ∈ A, then

(h∘(g∘f))(a) = h((g∘f)(a)) = h(g(f(a)))  and  ((h∘g)∘f)(a) = (h∘g)(f(a)) = h(g(f(a)))

Thus (h∘(g∘f))(a) = ((h∘g)∘f)(a) for every a ∈ A, and so h∘(g∘f) = (h∘g)∘f.

Remark: Let F : A → B. Some texts write aF instead of F(a) for the image of a ∈ A under F. With this notation, the composition of functions F : A → B and G : B → C is denoted by F∘G and not by G∘F as used in this text.

We next introduce some special types of mappings.

Definition: A mapping f : A → B is said to be one-to-one (or one-one or 1-1) or injective if different elements of A have distinct images; that is,

if a ≠ a' then f(a) ≠ f(a')

or, equivalently, if f(a) = f(a') then a = a'.

Definition: A mapping f : A → B is said to be onto (or: f maps A onto B) or surjective if every b ∈ B is the image of at least one a ∈ A.

A mapping which is both one-one and onto is said to be bijective.

Example 6.7: Let f : R → R, g : R → R and h : R → R be defined by f(x) = 2ˣ, g(x) = x³ - x and h(x) = x². The graphs of these mappings follow. The mapping f is one-one; geometrically, this means that each horizontal line does not contain more than one point of f. The mapping g is onto; geometrically, this means that each horizontal line contains at least one point of g. The mapping h is neither one-one nor onto; for example, 2 and -2 have the same image 4, and -16 is not the image of any element of R.

Example 6.8: Let A be any set. The mapping f : A → A defined by f(a) = a, i.e. which assigns to each element in A itself, is called the identity mapping on A and is denoted by 1_A or 1 or I.

Example 6.9: Let f : A → B.
We call g : B → A the inverse of f, written f⁻¹, if

f∘g = 1_B  and  g∘f = 1_A

We emphasize that f has an inverse if and only if f is both one-to-one and onto (Problem 6.9). Also, if b ∈ B then f⁻¹(b) = a where a is the unique element of A for which f(a) = b.

LINEAR MAPPINGS

Let V and U be vector spaces over the same field K. A mapping F : V → U is called a linear mapping (or linear transformation or vector space homomorphism) if it satisfies the following two conditions:

(1) For any v, w ∈ V, F(v + w) = F(v) + F(w).
(2) For any k ∈ K and any v ∈ V, F(kv) = kF(v).

In other words, F : V → U is linear if it "preserves" the two basic operations of a vector space, that of vector addition and that of scalar multiplication.

Substituting k = 0 into (2) we obtain F(0) = 0. That is, every linear mapping takes the zero vector into the zero vector.

Now for any scalars a, b ∈ K and any vectors v, w ∈ V we obtain, by applying both conditions of linearity,

F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)

More generally, for any scalars ai ∈ K and any vectors vi ∈ V we obtain the basic property of linear mappings:

F(a1v1 + a2v2 + ... + anvn) = a1F(v1) + a2F(v2) + ... + anF(vn)

We remark that the condition F(av + bw) = aF(v) + bF(w) completely characterizes linear mappings and is sometimes used as their definition.

Example 6.10: Let A be any m × n matrix over a field K. As noted previously, A determines a mapping T : Kⁿ → Kᵐ by the assignment v ↦ Av. (Here the vectors in Kⁿ and Kᵐ are written as columns.) We claim that T is linear. For, by properties of matrices,

T(v + w) = A(v + w) = Av + Aw = T(v) + T(w)  and  T(kv) = A(kv) = kAv = kT(v)

where v, w ∈ Kⁿ and k ∈ K.

We comment that the above type of linear mapping shall occur again and again. In fact, in the next chapter we show that every linear mapping from one finite-dimensional vector space into another can be represented as a linear mapping of the above type.
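The linearity claimed in Example 6.10 can be checked numerically; the sketch below (our own, using the matrix of Example 6.3 and vectors of our choosing) verifies both conditions for a particular matrix.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, -3, 5],
     [2, 4, -1]]          # the matrix of Example 6.3

v = [3, 1, -2]
w = [1, 0, 4]             # a second vector, chosen arbitrarily
k = 7

Tv = mat_vec(A, v)        # T(v) = Av = [-10, 12], as in Example 6.3
vw = [x + y for x, y in zip(v, w)]
lhs = mat_vec(A, vw)                                # T(v + w)
rhs = [x + y for x, y in zip(Tv, mat_vec(A, w))]    # T(v) + T(w)
print(lhs == rhs)                                   # True: additivity
print(mat_vec(A, [k * x for x in v]) == [k * x for x in Tv])  # True: homogeneity
```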
Example 6.11: Let F : R³ → R³ be the "projection" mapping into the xy plane: F(x, y, z) = (x, y, 0). We show that F is linear. Let v = (a, b, c) and w = (a', b', c'). Then

F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0) = (a, b, 0) + (a', b', 0) = F(v) + F(w)

and, for any k ∈ R,

F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)

That is, F is linear.

Example 6.12: Let F : R² → R² be the "translation" mapping defined by F(x, y) = (x + 1, y + 2). Observe that F(0) = F(0, 0) = (1, 2) ≠ 0. That is, the zero vector is not mapped onto the zero vector. Hence F is not linear.

Example 6.13: Let F : V → U be the mapping which assigns 0 ∈ U to every v ∈ V. Then, for any v, w ∈ V and any k ∈ K, we have

F(v + w) = 0 = 0 + 0 = F(v) + F(w)  and  F(kv) = 0 = k0 = kF(v)

Thus F is linear. We call F the zero mapping and shall usually denote it by 0.

Example 6.14: Consider the identity mapping I : V → V which maps each v ∈ V into itself. Then, for any v, w ∈ V and any a, b ∈ K, we have

I(av + bw) = av + bw = aI(v) + bI(w)

Thus I is linear.

Example 6.15: Let V be the vector space of polynomials in the variable t over the real field R. Then the differential mapping D : V → V and the integral mapping ∫ : V → R defined in Examples 6.4 and 6.5 are linear. For it is proven in calculus that for any u, v ∈ V and k ∈ R,

d(u + v)/dt = du/dt + dv/dt  and  d(ku)/dt = k du/dt

that is, D(u + v) = D(u) + D(v) and D(ku) = kD(u); and also,

∫₀¹ (u(t) + v(t)) dt = ∫₀¹ u(t) dt + ∫₀¹ v(t) dt  and  ∫₀¹ ku(t) dt = k ∫₀¹ u(t) dt

that is, ∫(u + v) = ∫(u) + ∫(v) and ∫(ku) = k∫(u).

Example 6.16: Let F : V → U be a linear mapping which is both one-one and onto. Then an inverse mapping F⁻¹ : U → V exists. We will show (Problem 6.17) that this inverse mapping is also linear.

When we investigated the coordinates of a vector relative to a basis, we also introduced the notion of two spaces being isomorphic. We now give a formal definition.

Definition: A linear mapping F : V → U is called an isomorphism if it is one-to-one.
The vector spaces V, U are said to be isomorphic if there is an isomorphism of V onto U.

Example 6.17: Let V be a vector space over K of dimension n and let {e1, ..., en} be a basis of V. Then as noted previously the mapping v ↦ [v]_e, i.e. which maps each v ∈ V into its coordinate vector relative to the basis {ei}, is an isomorphism of V onto Kⁿ.

Our next theorem gives us an abundance of examples of linear mappings; in particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.

Theorem 6.2: Let V and U be vector spaces over a field K. Let {v1, v2, ..., vn} be a basis of V and let u1, u2, ..., un be any vectors in U. Then there exists a unique linear mapping F : V → U such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.

We emphasize that the vectors u1, ..., un in the preceding theorem are completely arbitrary; they may be linearly dependent or they may even be equal to each other.

KERNEL AND IMAGE OF A LINEAR MAPPING

We begin by defining two concepts.

Definition: Let F : V → U be a linear mapping. The image of F, written Im F, is the set of image points in U:

Im F = {u ∈ U : F(v) = u for some v ∈ V}

The kernel of F, written Ker F, is the set of elements in V which map into 0 ∈ U:

Ker F = {v ∈ V : F(v) = 0}

The following theorem is easily proven (Problem 6.22).

Theorem 6.3: Let F : V → U be a linear mapping. Then the image of F is a subspace of U and the kernel of F is a subspace of V.

Example 6.18: Let F : R³ → R³ be the projection mapping into the xy plane: F(x, y, z) = (x, y, 0). Clearly the image of F is the entire xy plane:

Im F = {(a, b, 0) : a, b ∈ R}

Note that the kernel of F is the z axis:

Ker F = {(0, 0, c) : c ∈ R}

since these points and only these points map into the zero vector 0 = (0, 0, 0).
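A quick numerical sanity check of Example 6.18 (a sketch of ours, not part of the text): the projection sends exactly the z axis to the zero vector, and every point of the xy plane is its own image.

```python
def F(v):
    """The projection of R^3 into the xy plane (Example 6.18)."""
    x, y, z = v
    return (x, y, 0)

assert F((0, 0, 5)) == (0, 0, 0)    # points on the z axis lie in Ker F ...
assert F((1, 0, 5)) == (1, 0, 0)    # ... and points off it do not
assert F((2, -3, 0)) == (2, -3, 0)  # (a, b, 0) is attained, so Im F is the xy plane
print("Ker F = z axis, Im F = xy plane")
```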
Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear equations AX = 0 is n - r, where n is the number of unknowns and r is the rank of the coefficient matrix A.

OPERATIONS WITH LINEAR MAPPINGS

We are able to combine linear mappings in various ways to obtain new linear mappings. These operations are very important and shall be used throughout the text.

Suppose F : V → U and G : V → U are linear mappings of vector spaces over a field K. We define the sum F + G to be the mapping from V into U which assigns F(v) + G(v) to v ∈ V:

(F + G)(v) = F(v) + G(v)

Furthermore, for any scalar k ∈ K, we define the product kF to be the mapping from V into U which assigns kF(v) to v ∈ V:

(kF)(v) = kF(v)

We show that if F and G are linear, then F + G and kF are also linear. We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,

(F + G)(av + bw) = F(av + bw) + G(av + bw)
                 = aF(v) + bF(w) + aG(v) + bG(w)
                 = a(F(v) + G(v)) + b(F(w) + G(w))
                 = a(F + G)(v) + b(F + G)(w)

and

(kF)(av + bw) = kF(av + bw) = k(aF(v) + bF(w)) = akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)

Thus F + G and kF are linear. The following theorem applies.

Theorem 6.6: Let V and U be vector spaces over a field K. Then the collection of all linear mappings from V into U with the above operations of addition and scalar multiplication forms a vector space over K.

The space in the above theorem is usually denoted by

Hom(V, U)

Here Hom comes from the word homomorphism. In the case that V and U are of finite dimension, we have the following theorem.

Theorem 6.7: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.

Now suppose that V, U and W are vector spaces over the same field K, and that F : V → U and G : U → W are linear mappings. Recall that the composition function G∘F is the mapping from V into W defined by (G∘F)(v) = G(F(v)). We show that G∘F is linear whenever F and G are linear.
We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,

(G∘F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w)) = aG(F(v)) + bG(F(w)) = a(G∘F)(v) + b(G∘F)(w)

That is, G∘F is linear.

The composition of linear mappings is related to the operations of addition and scalar multiplication as follows:

Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F' be linear mappings from V into U and G, G' linear mappings from U into W, and let k ∈ K. Then:

(i) G∘(F + F') = G∘F + G∘F'
(ii) (G + G')∘F = G∘F + G'∘F
(iii) k(G∘F) = (kG)∘F = G∘(kF).

ALGEBRA OF LINEAR OPERATORS

Let V be a vector space over a field K. We now consider the special case of linear mappings T : V → V, i.e. from V into itself. They are also called linear operators or linear transformations on V. We will write A(V), instead of Hom(V, V), for the space of all such mappings.

By Theorem 6.6, A(V) is a vector space over K; it is of dimension n² if V is of dimension n. Now if T, S ∈ A(V), then the composition S∘T exists and is also a linear mapping from V into itself, i.e. S∘T ∈ A(V). Thus we have a "multiplication" defined in A(V). (We shall write ST for S∘T in the space A(V).)

We remark that an algebra A over a field K is a vector space over K in which an operation of multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K,

(i) F(G + H) = FG + FH
(ii) (G + H)F = GF + HF
(iii) k(GF) = (kG)F = G(kF).

If the associative law also holds for the multiplication, i.e. if for every F, G, H ∈ A,

(iv) (FG)H = F(GH)

then the algebra A is said to be associative. Thus by Theorems 6.8 and 6.1, A(V) is an associative algebra over K with respect to composition of mappings; hence it is frequently called the algebra of linear operators on V.

Observe that the identity mapping I : V → V belongs to A(V). Also, for any T ∈ A(V), we have TI = IT = T. We note that we can also form "powers" of T; we use the notation T² = T∘T, T³ = T∘T∘T, ....
Furthermore, for any polynomial

p(x) = a0 + a1x + a2x² + ... + anxⁿ,  ai ∈ K

we can form the operator p(T) defined by

p(T) = a0 I + a1 T + a2 T² + ... + an Tⁿ

(For a scalar k ∈ K, the operator kI is frequently denoted by simply k.) In particular, if p(T) = 0, the zero mapping, then T is said to be a zero of the polynomial p(x).

Example 6.21: Let T : R³ → R³ be defined by T(x, y, z) = (0, x, y). Now if (a, b, c) is any element of R³, then:

(T + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)

and

T³(a, b, c) = T²(0, a, b) = T(0, 0, a) = (0, 0, 0)

Thus we see that T³ = 0, the zero mapping from V into itself. In other words, T is a zero of the polynomial p(x) = x³.

INVERTIBLE OPERATORS

A linear operator T : V → V is said to be invertible if it has an inverse, i.e. if there exists T⁻¹ ∈ A(V) such that TT⁻¹ = T⁻¹T = I.

Now T is invertible if and only if it is one-one and onto. Thus in particular, if T is invertible then only 0 ∈ V can map into 0, i.e. T is nonsingular. On the other hand, suppose T is nonsingular, i.e. Ker T = {0}. Recall (page 127) that T is then also one-one. Moreover, assuming V has finite dimension, we have, by Theorem 6.4,

dim V = dim (Im T) + dim (Ker T) = dim (Im T) + dim ({0}) = dim (Im T) + 0 = dim (Im T)

Then Im T = V, i.e. the image of T is V; thus T is onto. Hence T is both one-one and onto and so is invertible. We have just proven

Theorem 6.9: A linear operator T : V → V on a vector space of finite dimension is invertible if and only if it is nonsingular.

Example 6.22: Let T be the operator on R² defined by T(x, y) = (y, 2x - y). The kernel of T is {(0, 0)}; hence T is nonsingular and, by the preceding theorem, invertible. We now find a formula for T⁻¹. Suppose (s, t) is the image of (x, y) under T; hence (x, y) is the image of (s, t) under T⁻¹: T(x, y) = (s, t) and T⁻¹(s, t) = (x, y). We have

T(x, y) = (y, 2x - y) = (s, t)  and so  y = s, 2x - y = t

Solving for x and y in terms of s and t, we obtain x = ½s + ½t, y = s.
Thus T⁻¹ is given by the formula T⁻¹(s, t) = (½s + ½t, s).

The finiteness of the dimensionality of V in the preceding theorem is necessary, as seen in the next example.

Example 6.23: Let V be the vector space of polynomials over K, and let T be the operator on V defined by

T(a0 + a1t + ... + antⁿ) = a0t + a1t² + ... + antⁿ⁺¹

i.e. T increases the exponent of t in each term by 1. Now T is a linear mapping and is nonsingular. However, T is not onto and so is not invertible.

We now give an important application of the above theorem to systems of linear equations over K. Consider a system with the same number of equations as unknowns, say n. We can represent this system by the matrix equation

Ax = b  (*)

where A is an n-square matrix over K which we view as a linear operator on Kⁿ. Suppose the matrix A is nonsingular, i.e. the matrix equation Ax = 0 has only the zero solution. Then, by Theorem 6.9, the linear mapping A is one-to-one and onto. This means that the system (*) has a unique solution for any b ∈ Kⁿ. On the other hand, suppose the matrix A is singular, i.e. the matrix equation Ax = 0 has a nonzero solution. Then the linear mapping A is not onto. This means that there exist b ∈ Kⁿ for which (*) does not have a solution. Furthermore, if a solution exists it is not unique. Thus we have proven the following fundamental result:

Theorem 6.10: Consider the following system of linear equations:

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
.................................
an1 x1 + an2 x2 + ... + ann xn = bn

(i) If the corresponding homogeneous system has only the zero solution, then the above system has a unique solution for any values of the bi.

(ii) If the corresponding homogeneous system has a nonzero solution, then: (a) there are values of the bi for which the above system does not have a solution; (b) whenever a solution of the above system exists, it is not unique.

Solved Problems

MAPPINGS

6.1.
State whether or not each diagram defines a mapping from A = {a, b, c} into B = {x, y, z}.

(i) No. There is nothing assigned to the element b ∈ A.
(ii) No. Two elements, x and z, are assigned to c ∈ A.
(iii) Yes.

6.2. Use a formula to define each of the following functions from R into R.
(i) To each number let f assign its cube.
(ii) To each number let g assign the number 5.
(iii) To each positive number let h assign its square, and to each nonpositive number let h assign the number 6.
Also, find the value of each function at 4, -2 and 0.

(i) Since f assigns to any number x its cube x³, we can define f by f(x) = x³. Also:

f(4) = 4³ = 64,  f(-2) = (-2)³ = -8,  f(0) = 0³ = 0

(ii) Since g assigns 5 to any number x, we can define g by g(x) = 5. Thus the value of g at each of the numbers 4, -2 and 0 is 5:

g(4) = 5,  g(-2) = 5,  g(0) = 5

(iii) Two different rules are used to define h as follows:

h(x) = x² if x > 0,  h(x) = 6 if x ≤ 0

Since 4 > 0, h(4) = 4² = 16. On the other hand, -2 ≤ 0 and 0 ≤ 0, and so h(-2) = 6, h(0) = 6.

6.3. Let A = {1, 2, 3, 4, 5} and let f : A → A be the mapping defined by the diagram on the right. (i) Find the image of f. (ii) Find the graph of f.

(i) The image f(A) of the mapping f consists of all the points assigned to elements of A. Now only 2, 3 and 5 appear as the image of any elements of A; hence f(A) = {2, 3, 5}.

(ii) The graph of f consists of the ordered pairs (a, f(a)), where a ∈ A. Now f(1) = 3, f(2) = 5, f(3) = 5, f(4) = 2, f(5) = 3; hence the graph of f = {(1, 3), (2, 5), (3, 5), (4, 2), (5, 3)}.

6.4. Sketch the graph of (i) f(x) = x² + x - 6, (ii) g(x) = x³ - 3x² - x + 3.

Note that these are "polynomial functions". In each case set up a table of values for x and then find the corresponding values of f(x). Plot the points in a coordinate diagram and then draw a smooth continuous curve through the points.

(i)  x    | -4 | -3 | -2 | -1 |  0 |  1 | 2 | 3
     f(x) |  6 |  0 | -4 | -6 | -6 | -4 | 0 | 6

(ii) x    |  -2 | -1 | 0 | 1 |  2 | 3 |  4
     g(x) | -15 |  0 | 3 | 0 | -3 | 0 | 15

6.5.
Let the mappings f : A → B and g : B → C be defined by the diagram. (i) Find the composition mapping (g∘f) : A → C. (ii) Find the image of each mapping: f, g and g∘f.

(i) We use the definition of the composition mapping to compute:

(g∘f)(a) = g(f(a)) = g(y) = t
(g∘f)(b) = g(f(b)) = g(x) = s
(g∘f)(c) = g(f(c)) = g(y) = t

Observe that we arrive at the same answer if we "follow the arrows" in the diagram:

a → y → t,  b → x → s,  c → y → t

(ii) By the diagram, the image values under the mapping f are x and y, and the image values under g are r, s and t; hence

image of f = {x, y}  and  image of g = {r, s, t}

By (i), the image values under the composition mapping g∘f are t and s; hence image of g∘f = {s, t}. Note that the images of g and g∘f are different.

6.6. Let the mappings f and g be defined by f(x) = 2x + 1 and g(x) = x² - 2. (i) Find (g∘f)(4) and (f∘g)(4). (ii) Find formulas defining the composition mappings g∘f and f∘g.

(i) f(4) = 2·4 + 1 = 9. Hence (g∘f)(4) = g(f(4)) = g(9) = 9² - 2 = 79.

g(4) = 4² - 2 = 14. Hence (f∘g)(4) = f(g(4)) = f(14) = 2·14 + 1 = 29.

(ii) Compute the formula for g∘f as follows:

(g∘f)(x) = g(f(x)) = g(2x + 1) = (2x + 1)² - 2 = 4x² + 4x - 1

Observe that the same answer can be found by writing y = f(x) = 2x + 1 and z = g(y) = y² - 2, and then eliminating y: z = y² - 2 = (2x + 1)² - 2 = 4x² + 4x - 1.

(f∘g)(x) = f(g(x)) = f(x² - 2) = 2(x² - 2) + 1 = 2x² - 3

Observe that f∘g ≠ g∘f.

6.7. Let the mappings f : A → B, g : B → C and h : C → D be defined by the diagram. Determine whether each mapping (i) is one-one, (ii) is onto, (iii) has an inverse.

(i) The mapping f : A → B is one-one since each element of A has a different image. The mapping g : B → C is not one-one since x and z both map into the same element 4. The mapping h : C → D is one-one.

(ii) The mapping f : A → B is not onto since not every element of B is the image of an element of A.
The mapping g : B → C is onto since each element of C is the image of some element of B. The mapping h : C → D is also onto.

(iii) A mapping has an inverse if and only if it is both one-one and onto. Hence only h has an inverse.

6.8. Suppose f : A → B and g : B → C; hence the composition mapping (g∘f) : A → C exists. Prove the following. (i) If f and g are one-one, then g∘f is one-one. (ii) If f and g are onto, then g∘f is onto. (iii) If g∘f is one-one, then f is one-one. (iv) If g∘f is onto, then g is onto.

(i) Suppose (g∘f)(x) = (g∘f)(y). Then g(f(x)) = g(f(y)). Since g is one-one, f(x) = f(y). Since f is one-one, x = y. We have proven that (g∘f)(x) = (g∘f)(y) implies x = y; hence g∘f is one-one.

(ii) Suppose c ∈ C. Since g is onto, there exists b ∈ B for which g(b) = c. Since f is onto, there exists a ∈ A for which f(a) = b. Thus (g∘f)(a) = g(f(a)) = g(b) = c; hence g∘f is onto.

(iii) Suppose f is not one-one. Then there exist distinct elements x, y ∈ A for which f(x) = f(y). Thus (g∘f)(x) = g(f(x)) = g(f(y)) = (g∘f)(y); hence g∘f is not one-one. Accordingly, if g∘f is one-one, then f must be one-one.

(iv) If a ∈ A, then (g∘f)(a) = g(f(a)) ∈ g(B); hence (g∘f)(A) ⊆ g(B). Suppose g is not onto. Then g(B) is properly contained in C and so (g∘f)(A) is properly contained in C; thus g∘f is not onto. Accordingly, if g∘f is onto, then g must be onto.

6.9. Prove that a mapping f : A → B has an inverse if and only if it is one-to-one and onto.

Suppose f has an inverse, i.e. there exists a function f⁻¹ : B → A for which f⁻¹∘f = 1_A and f∘f⁻¹ = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 6.8(iii); and since 1_B is onto, f is onto by Problem 6.8(iv). That is, f is both one-to-one and onto.

Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, which we denote by b̂. Thus if f(a) = b, then a = b̂; hence f(b̂) = b. Now let g denote the mapping from B to A defined by b ↦ b̂. We have:

(i) (g∘f)(a) = g(f(a)) = b̂ = a, for every a ∈ A; hence g∘f = 1_A.

(ii) (f∘g)(b) = f(g(b)) = f(b̂) = b, for every b ∈ B; hence f∘g = 1_B.

Accordingly, f has an inverse. Its inverse is the mapping g.

6.10. Let f : R → R be defined by f(x) = 2x - 3. Now f is one-to-one and onto; hence f has an inverse mapping f⁻¹. Find a formula for f⁻¹.

Let y be the image of x under the mapping f: y = f(x) = 2x - 3. Consequently x will be the image of y under the inverse mapping f⁻¹. Thus solve for x in terms of y in the above equation: x = (y + 3)/2. Then the formula defining the inverse function is f⁻¹(y) = (y + 3)/2.

LINEAR MAPPINGS

6.11. Show that the following mappings F are linear:
(i) F : R² → R² defined by F(x, y) = (x + y, x).
(ii) F : R³ → R defined by F(x, y, z) = 2x - 3y + 4z.

(i) Let v = (a, b) and w = (a', b'); hence

v + w = (a + a', b + b')  and  kv = (ka, kb), k ∈ R

We have F(v) = (a + b, a) and F(w) = (a' + b', a'). Thus

F(v + w) = F(a + a', b + b') = (a + a' + b + b', a + a') = (a + b, a) + (a' + b', a') = F(v) + F(w)

and

F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)

Since v, w and k were arbitrary, F is linear.

(ii) Let v = (a, b, c) and w = (a', b', c'); hence

v + w = (a + a', b + b', c + c')  and  kv = (ka, kb, kc), k ∈ R

We have F(v) = 2a - 3b + 4c and F(w) = 2a' - 3b' + 4c'. Thus

F(v + w) = F(a + a', b + b', c + c') = 2(a + a') - 3(b + b') + 4(c + c') = (2a - 3b + 4c) + (2a' - 3b' + 4c') = F(v) + F(w)

and

F(kv) = F(ka, kb, kc) = 2ka - 3kb + 4kc = k(2a - 3b + 4c) = kF(v)

Accordingly, F is linear.

6.12. Show that the following mappings F are not linear:
(i) F : R² → R defined by F(x, y) = xy.
(ii) F : R² → R³ defined by F(x, y) = (x + 1, 2y, x + y).
(iii) F : R³ → R² defined by F(x, y, z) = (|x|, 0).

(i) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6). We have F(v) = 1·2 = 2 and F(w) = 3·4 = 12. Hence

F(v + w) = F(4, 6) = 4·6 = 24 ≠ F(v) + F(w)

Accordingly, F is not linear.

(ii) Since F(0, 0) = (1, 0, 0) ≠ (0, 0, 0), F cannot be linear.
(iii) Let v = (1, 2, 3) and k = -3; hence kv = (-3, -6, -9). We have F(v) = (1, 0) and so kF(v) = -3(1, 0) = (-3, 0). Then

F(kv) = F(-3, -6, -9) = (3, 0) ≠ kF(v)

and hence F is not linear.

6.13. Let V be the vector space of n-square matrices over K. Let M be an arbitrary matrix in V. Let T : V → V be defined by T(A) = AM + MA, where A ∈ V. Show that T is linear.

For any A, B ∈ V and any k ∈ K, we have

T(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB = (AM + MA) + (BM + MB) = T(A) + T(B)

and

T(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kT(A)

Accordingly, T is linear.

6.14. Prove Theorem 6.2: Let V and U be vector spaces over a field K. Let {v1, ..., vn} be a basis of V and let u1, ..., un be any arbitrary vectors in U. Then there exists a unique linear mapping F : V → U such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.

There are three steps to the proof of the theorem: (1) Define a mapping F : V → U such that F(vi) = ui, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.

Step (1). Let v ∈ V. Since {v1, ..., vn} is a basis of V, there exist unique scalars a1, ..., an ∈ K for which v = a1v1 + a2v2 + ... + anvn. We define F : V → U by F(v) = a1u1 + a2u2 + ... + anun. (Since the ai are unique, the mapping F is well-defined.) Now, for i = 1, ..., n,

vi = 0v1 + ... + 1vi + ... + 0vn

Hence

F(vi) = 0u1 + ... + 1ui + ... + 0un = ui

Thus the first step of the proof is complete.

Step (2). Suppose v = a1v1 + a2v2 + ... + anvn and w = b1v1 + b2v2 + ... + bnvn. Then

v + w = (a1 + b1)v1 + (a2 + b2)v2 + ... + (an + bn)vn

and, for any k ∈ K, kv = ka1v1 + ka2v2 + ... + kanvn. By definition of the mapping F,

F(v) = a1u1 + a2u2 + ... + anun  and  F(w) = b1u1 + b2u2 + ... + bnun

Hence

F(v + w) = (a1 + b1)u1 + (a2 + b2)u2 + ... + (an + bn)un
         = (a1u1 + a2u2 + ... + anun) + (b1u1 + b2u2 + ... + bnun) = F(v) + F(w)

and

F(kv) = k(a1u1 + a2u2 + ... + anun) = kF(v)

Thus F is linear.

Step (3). Now suppose G : V → U is linear and G(vi) = ui, i = 1, ..., n. If v = a1v1 + a2v2 + ... + anvn, then

G(v) = G(a1v1 + a2v2 + ... + anvn) = a1G(v1) + a2G(v2) + ... + anG(vn) = a1u1 + a2u2 + ... + anun = F(v)

Since G(v) = F(v) for every v ∈ V, G = F. Thus F is unique and the theorem is proved.

6.15. Let T : R² → R be the linear mapping for which

T(1, 1) = 3  and  T(0, 1) = -2  (1)

(Since {(1, 1), (0, 1)} is a basis of R², such a linear mapping exists and is unique by Theorem 6.2.) Find T(a, b).

First we write (a, b) as a linear combination of (1, 1) and (0, 1) using unknown scalars x and y:

(a, b) = x(1, 1) + y(0, 1)  (2)

Then (a, b) = (x, x) + (0, y) = (x, x + y) and so x = a, x + y = b. Solving for x and y in terms of a and b, we obtain

x = a  and  y = b - a  (3)

Now using (1) and (2) we have

T(a, b) = T(x(1, 1) + y(0, 1)) = xT(1, 1) + yT(0, 1) = 3x - 2y

Finally, using (3) we have T(a, b) = 3x - 2y = 3a - 2(b - a) = 5a - 2b.

6.16. Let T : V → U be linear, and suppose v1, ..., vn ∈ V have the property that their images T(v1), ..., T(vn) are linearly independent. Show that the vectors v1, ..., vn are also linearly independent.

Suppose that, for scalars a1, ..., an, a1v1 + a2v2 + ... + anvn = 0. Then

0 = T(0) = T(a1v1 + a2v2 + ... + anvn) = a1T(v1) + a2T(v2) + ... + anT(vn)

Since the T(vi) are linearly independent, all the ai = 0. Thus the vectors v1, ..., vn are linearly independent.

6.17. Suppose the linear mapping F : V → U is one-to-one and onto. Show that the inverse mapping F⁻¹ : U → V is also linear.

Suppose u, u' ∈ U. Since F is one-to-one and onto, there exist unique vectors v, v' ∈ V for which F(v) = u and F(v') = u'. Since F is linear, we also have

F(v + v') = F(v) + F(v') = u + u'  and  F(kv) = kF(v) = ku

By definition of the inverse mapping, F⁻¹(u) = v, F⁻¹(u') = v', F⁻¹(u + u') = v + v' and F⁻¹(ku) = kv. Then

F⁻¹(u + u') = v + v' = F⁻¹(u) + F⁻¹(u')  and  F⁻¹(ku) = kv = kF⁻¹(u)

and thus F⁻¹ is linear.

IMAGE AND KERNEL OF LINEAR MAPPINGS

6.18.
Let F : R⁴ → R³ be the linear mapping defined by

F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t)

Find a basis and the dimension of the (i) image U of F, (ii) kernel W of F.

(i) The images of the following generators of R⁴ generate the image U of F:

F(1, 0, 0, 0) = (1, 1, 1)    F(0, 0, 1, 0) = (1, 2, 3)
F(0, 1, 0, 0) = (-1, 0, 1)   F(0, 0, 0, 1) = (1, -1, -3)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

|  1  1  1 |      | 1  1  1 |      | 1 1 1 |
| -1  0  1 |  to  | 0  1  2 |  to  | 0 1 2 |
|  1  2  3 |      | 0  1  2 |      | 0 0 0 |
|  1 -1 -3 |      | 0 -2 -4 |      | 0 0 0 |

Thus {(1, 1, 1), (0, 1, 2)} is a basis of U; hence dim U = 2.

(ii) We seek the set of (x, y, s, t) such that F(x, y, s, t) = (0, 0, 0), i.e.,

F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t) = (0, 0, 0)

Set corresponding components equal to each other to form the following homogeneous system whose solution space is the kernel W of F:

x - y + s + t = 0        x - y + s + t = 0        x - y + s + t = 0
x + 2s - t = 0      or   y + s - 2t = 0      or   y + s - 2t = 0
x + y + 3s - 3t = 0      2y + 2s - 4t = 0

The free variables are s and t; hence dim W = 2. Set (a) s = -1, t = 0 to obtain the solution (2, 1, -1, 0); (b) s = 0, t = 1 to obtain the solution (1, 2, 0, 1). Thus {(2, 1, -1, 0), (1, 2, 0, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 2 = 4, which is the dimension of the domain R⁴ of F.)

6.19. Let T : R³ → R³ be the linear mapping defined by

T(x, y, z) = (x + 2y - z, y + z, x + y - 2z)

Find a basis and the dimension of the (i) image U of T, (ii) kernel W of T.

(i) The images of generators of R³ generate the image U of T:

T(1, 0, 0) = (1, 0, 1),  T(0, 1, 0) = (2, 1, 1),  T(0, 0, 1) = (-1, 1, -2)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

|  1  0  1 |      | 1  0  1 |      | 1  0  1 |
|  2  1  1 |  to  | 0  1 -1 |  to  | 0  1 -1 |
| -1  1 -2 |      | 0  1 -1 |      | 0  0  0 |

Thus {(1, 0, 1), (0, 1, -1)} is a basis of U, and so dim U = 2.
(ii) We seek the set of {x,y,z) such that T(x,y,z) = (0,0,0), i.e., T(x, y,z) = {x + 2y - z, y + z, X -\- y -2z) = (0, 0, 0) Set corresponding components equal to each other to form the homogeneous system whose solution space is the kernel W of T: X + 2y — z = x + 2y - z = X + 2y - z = y + z = a or y + z = or y + z = a; + y — 2z = a —y — z = The only free variable is z; hence dim W = 1. Let z — 1; then y = —1 and a; = 3. Thus {(3, —1, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 1 = 3, which is the dimen- sion of the domain R3 of T.) 6.20. Find a linear map F : R^ ^ R* whose image is generated by (1, 2, 0, —4) and (2, 0, —1, —3). Method 1. Consider the usual basis of R^: e^ = (1, 0, 0), eg = (0, 1. 0), eg = (0, 0, 1). Set F(ei) = (1, 2, 0, -4), F(e2) = (2, 0, —1, —3) and F{eg) = (0, 0, 0, 0). By Theorem 6.2, such a linear map F exists and is unique. Furthermore, the image of F is generated by the F(ej); hence F has the required property. We find a general formula for F(x, y, z): F(x, y, z) = F{xei + ye^ + zeg) = xFie-^) + yF{e2) + 2^(63) = x(\, 2, 0, -4) + 2/(2, 0, -1, -3) + 2(0, 0, 0, 0) =z (x + 2y, 2x, —y, —4x — 3y) 138 LINEAR MAPPINGS [CHAP. 6 Method 2. Form a 4 X 3 matrix A whose columns consist only of the given vectors; say, 1 2 2\ A = 2 -1 -1 -4 -3 -3 Recall that A determines a linear map A : R3 ^ B^ whose image is generated by the columns of A. Thus A satisfies the required condition. 6.21. Let V be the vector space of 2 by 2 matrices over R and let M = | j . Let F:V^Y be the linear map defined by F{A^ = AM — MA. Find a basis and the dimension of the kernel W of F. We seek the set of (^ ^\ such that Fr' J = (p q) • K: :) - C X I) - {I DC ' _ /x 2x + 3y\ _ / \s 2s + St ) \ /-2s 2x + 2y- 2t\ _ /« \-2s 2s y ~ \0 Thus 2x + 2y -2t = x + y - t = or 2s = s = The free variables are y and t; hence dim W — 2. 
To obtain a basis of W set (a) y — —1, t = to obtain the solution x = 1, y = —1, s = 0, t = 0; (6) y — 0, t — 1 to obtain the solution x = 1, y = 0, s = 0, t = 1. t X + 2s y + 2t 3s 3( ^^"^{(o o)' G ;)} i« a basis of T^. 6.22. Prove Theorem 6.3: Let F.V^U be a linear mapping. Then (i) the image of F is a subspace of U and (ii) the kernel of F is a subspace of V. (i) Since F(Q) = 0, G Im F. Now suppose u, u' GlmF and a,b& K. Since u and u' belong to the image of F, there exist vectors v,v' GV such that F(v) = u and F(v') = u'. Then F{av + bv') - aF(v) + hF(v') = au + bu' e Im F Thus the image of F is a subspace of U. (ii) Since F(0) = 0, G Ker F. Now suppose v,wG Ker F and a,b e K. Since v and w belong to the kernel of F, F{v) = and F(w) = 0. Thus F(av + bw) = aF(v) + bF{w) ==: aO + 60 = and so av + bw S KerF Thus the kernel of F is a subspace of V. 6.23. Prove Theorem 6.4: Let V be of finite dimension, and let F:V-^U be a linear map- ping with image U' and kernel W. Then dim U' + dim W = dim V. Suppose dim V = n. Since W ia a. subspace of V, its dimension is finite; say, dim W = r — n. Thus we need prove that dim U' = n — r. CHAP. 6] LINEAR MAPPINGS 139 Let {wi, . . . , Wr) be a basis of W. We extend {wj to a basis of V: {w'l Wr,Vi, ...,i;„_J Let B = {F{Vi),F(v2), ...,F(v„^r)} The theorem is proved if we show that B is a basis of the image U' of F. Proof that B generates U'. Let u S U'. Then there exists v &V such that F(v) — u. Since {Wj, Vj} generates V and since v S V, V — OjWj + • • • + a,Wr + ^l'"! + • • • + b^-r'^n-r where the a„ ftj are scalars. Note that F(Wi) — since the Wj belong to the kernel of F. Thus u = jF'(t') = F(aiici + • • • + a^Wf + biv^ + • • • + b„^^v„-r) = aiF{wi) + ■■■ + a^(Wr) + b^Fivi) + ■■■ + 6„_^F(i;„_,) = OjO + • • • + a^O + biF(Vi) + • • ■ + bn-rF(Vn-r) = b,F(v,) +■■■+ 6„_,FK_,) Accordingly, the F{Vf) generate the image of F. Proof that B is linearly independent. Suppose a^Fivi) + a.2F(v2) + • • • + a„_ri^K_,.) 
= Then F(aiVi + 02^2 + • • • + a^^^v^-r) = and so a^Vi + • • • + a„_,T;„_^ belongs to the kernel W of F. Since {wj generates W, there exist scalars 61, . . . , 6^ such that a^Vi + a2^'2 + • • • + an-r'Un-, = b^Wi + 62^2 + • • • + b^Wr or ail^i + • • • + an-r'Wn-r — b^Wi — • ■ • — fe^w^ = (*) Since {tWj, ■«{} is a basis of V, it is linearly independent; hence the coefficients of the W; and Vj in (*) are all 0. In particular, Oj = 0, . . ., a„_r = 0- Accordingly, the F(v^ are linearly independent. Thus B is a basis of V, and so dim V — n — r and the theorem is proved. 6.24. Suppose f:V-*U is linear with kernel W, and that f{v) = u. Show that the "coset" V + W = {v + w: w e W} is the preimage of u, that is, f~^{u) — v + W. We must prove that (i) f~Hu)cv + W and (ii) v + T^ c/-i(m). We first prove (i). Suppose v'Gf-Hu). Then f(v') = u and so f(v' - v) = f(v') - f{v) = u-u = 0, that is, v'-vGW. Thus v' = V + (v' — v) €. V + W and hence f~Hu) Cv + W. Now we prove (ii). Suppose v' G v+W. Then v' = 1; + w where w G W. Since W is the kernel of /, f(w) = 0. Accordingly, f{v') = /(-u + w) = f(v) + f(w) = /(t)) + = f(v) = m. Thus v' e /-i(m) and so v + Wc f-^(u). SINGULAR AND NONSINGULAR MAPPINGS 6.25. Suppose F:V ^U is linear and that V is of finite dimension. Show that V and the image of F have the same dimension if and only if F is nonsingular. Determine all nonsingular mappings T : R* ^ R^. By Theorem 6.4, dim F = dim (Im/f) + dim (Ker/i^). Hence V and ImF have the same di- mension if and only if dim (Ker F) = or KerF = {0}, i.e. if and only if F is nonsingular. Since the dimension of R^ is less than the dimension of R*, so is the dimension of the image of T. Accordingly, no linear mapping T : B* -» R^ can be nonsingular. 6.26. Prove that a linear mapping F:V-*U is nonsingular if and only if the image of an independent set is independent. Suppose F is nonsingular and suppose {v^, . . ., v^} is an independent subset of V. 
We claim that the vectors F(v1), ..., F(vn) are independent. Suppose a1F(v1) + a2F(v2) + ... + anF(vn) = 0, where ai ∈ K. Since F is linear, F(a1v1 + a2v2 + ... + anvn) = 0; hence
a1v1 + a2v2 + ... + anvn ∈ Ker F
But F is nonsingular, i.e. Ker F = {0}; hence a1v1 + a2v2 + ... + anvn = 0. Since the vi are linearly independent, all the ai are 0. Accordingly, the F(vi) are linearly independent. In other words, the image of the independent set {v1, ..., vn} is independent.
On the other hand, suppose the image of any independent set is independent. If v ∈ V is nonzero, then {v} is independent. Then {F(v)} is independent and so F(v) ≠ 0. Accordingly, F is nonsingular.

OPERATIONS WITH LINEAR MAPPINGS

6.27. Let F : R^3 -> R^2 and G : R^3 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y, z) = (x - z, y). Find formulas defining the mappings F + G, 3F and 2F - 5G.
(F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)
(3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)
(2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)
= (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)

6.28. Let F : R^3 -> R^2 and G : R^2 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) = (y, x). Derive formulas defining the mappings G∘F and F∘G.
(G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)
The mapping F∘G is not defined since the image of G is not contained in the domain of F.

6.29. Show: (i) the zero mapping 0, defined by 0(v) = 0 for every v ∈ V, is the zero element of Hom(V, U); (ii) the negative of F ∈ Hom(V, U) is the mapping (-1)F, i.e. -F = (-1)F.
(i) Let F ∈ Hom(V, U). Then, for every v ∈ V,
(F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)
Since (F + 0)(v) = F(v) for every v ∈ V, F + 0 = F.
(ii) For every v ∈ V,
(F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)
Since (F + (-1)F)(v) = 0(v) for every v ∈ V, F + (-1)F = 0. Thus (-1)F is the negative of F.

6.30.
Show that for Fi, ...,F„G Hom {V, U) and ai, ...,a„GK, and for any vGV, {aiFi + a2F2 H + a„F„)(i;) = aiFi{v) + aJFiiv) + ■ ■ • + ajf'niv) By definition of the mapping aiFj, (a^F^iv) = a^F^{v); hence the theorem holds for n = 1. Thus by induction, (aiFi + (I2F2 + • • ■ + a„F„)(i;) = (a^F^)(v) + {a^F^ + • ■ • + a„F„)(i;) = aiFiCv) + a^F^iv) + • • • + a„F„(D) 6.31. Let /^:R3^R2, G.W^B? and HrR^^R^ be defined by i^'Cx, i/, 2) = {x + y + z,x + y), G{x, y, z) = {2x + z,x + y) and i?(a;, y, z) = {2y, x). Show that F,G,H G Hom (RS R2) are linearly independent. Suppose, for scalars a,b,c G K, aF + bG + cH = {1) (Here is the zero mapping.) For e^ = (1, 0, 0) G R3, we have (aF + bG + cH)(e{) = aF(l, 0, 0) + bG(l, 0, 0) + cH(l, 0, 0) = a(l, 1) + 6(2, 1) + c(0, 1) = (a + 2b,a + b + c) CHAP. 6] LINEAR MAPPINGS 141 and 0(ei) = (0, 0). Thus by {!), (a + 2b, a+b + e) = (0, 0) and so a + 26 = and a + 6 + c = (2) Similarly for eg = (0, 1, 0) e R3, we have (aF + bG + cH){e2) = aF(0, 1, 0) + 6G(0, 1, 0) + cH(0, 1, 0) = a(l, 1) + 6(0, 1) + c(2, 0) = (a+2c, a+6) = 0(62) = (0,0) Thus a + 2e = and a + 6 = (5) Using (2) and (5) we obtain a = 0, 6 = 0, c = (■*) Since (1) implies (4), the mappings F, G and H are linearly independent. 6.32. Prove Theorem 6.7: Suppose dim y = m and dim U = n. Then dim Hom {V, U) - mn. Suppose {vi, . . .,v„} is a basis of V and {mj, . . .,m„} is a basis of V. By Theorem 6.2, a linear mapping in Hom {V, V) is uniquely determined by arbitrarily assigning elements of t/ to the basis elements Vj of V. We define F^ e Hom {V,U), i = 1, . . . , m, j = 1, ...,n to be the linear mapping for which Fij{v^ = Uj, and Fij(Vk) -0 for fe # i. That is, Fy maps Vi into Mj and the other v's into 0. Observe that {Fy} contains exactly mn elements; hence the theorem is proved if we show that it is a basis of Hom {V, U). Proof that {Fy} generates Hom (F, U). Let F G Hom {V, U). 
Suppose F(v1) = w1, F(v2) = w2, ..., F(vm) = wm. Since wk ∈ U, it is a linear combination of the u's; say,
wk = ak1u1 + ak2u2 + ... + aknun,   k = 1, ..., m,  aij ∈ K   (1)
Consider the linear mapping G = Σi Σj aij Fij (i = 1, ..., m; j = 1, ..., n). Since G is a linear combination of the Fij, the proof that {Fij} generates Hom(V, U) is complete if we show that F = G.
We now compute G(vk), k = 1, ..., m. Since Fij(vk) = 0 for k ≠ i and Fkj(vk) = uj,
G(vk) = Σi Σj aij Fij(vk) = Σj akj Fkj(vk) = Σj akj uj = ak1u1 + ak2u2 + ... + aknun
Thus by (1), G(vk) = wk for each k. But F(vk) = wk for each k. Accordingly, by Theorem 6.2, F = G; hence {Fij} generates Hom(V, U).
Proof that {Fij} is linearly independent. Suppose, for scalars aij ∈ K,
Σi Σj aij Fij = 0
For vk, k = 1, ..., m,
0 = 0(vk) = Σi Σj aij Fij(vk) = Σj akj Fkj(vk) = Σj akj uj = ak1u1 + ak2u2 + ... + aknun
But the uj are linearly independent; hence for k = 1, ..., m, we have ak1 = 0, ak2 = 0, ..., akn = 0. In other words, all the aij = 0 and so {Fij} is linearly independent.
Thus {Fij} is a basis of Hom(V, U); hence dim Hom(V, U) = mn.

6.33. Prove Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F' be linear mappings from V into U and let G, G' be linear mappings from U into W; and let k ∈ K. Then: (i) G∘(F + F') = G∘F + G∘F'; (ii) (G + G')∘F = G∘F + G'∘F; (iii) k(G∘F) = (kG)∘F = G∘(kF).
(i) For every v ∈ V,
(G∘(F + F'))(v) = G((F + F')(v)) = G(F(v) + F'(v)) = G(F(v)) + G(F'(v))
= (G∘F)(v) + (G∘F')(v) = (G∘F + G∘F')(v)
Since (G∘(F + F'))(v) = (G∘F + G∘F')(v) for every v ∈ V, G∘(F + F') = G∘F + G∘F'.
(ii) For every v ∈ V,
((G + G')∘F)(v) = (G + G')(F(v)) = G(F(v)) + G'(F(v)) = (G∘F)(v) + (G'∘F)(v) = (G∘F + G'∘F)(v)
Since ((G + G')∘F)(v) = (G∘F + G'∘F)(v) for every v ∈ V, (G + G')∘F = G∘F + G'∘F.
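Parts (i) and (ii) of Theorem 6.8 can be spot-checked on concrete maps. In the sketch below, linear maps are represented by matrices and composition by matrix multiplication (the sample matrices are our own choices, not from the text); the two distributive laws then reduce to distributivity of the matrix product.

```python
def matmul(A, B):
    """Multiply matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Arbitrary 2x2 matrices standing in for F, F' : V -> U and G, G' : U -> W
F, Fp = [[1, 2], [3, 4]], [[0, -1], [5, 2]]
G, Gp = [[2, 0], [1, 1]], [[-1, 3], [0, 2]]

lhs_i  = matmul(G, matadd(F, Fp))              # G ∘ (F + F')
rhs_i  = matadd(matmul(G, F), matmul(G, Fp))   # G∘F + G∘F'
lhs_ii = matmul(matadd(G, Gp), F)              # (G + G') ∘ F
rhs_ii = matadd(matmul(G, F), matmul(Gp, F))   # G∘F + G'∘F
```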
(iii) For every v ∈ V,
(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG)∘F)(v)
and
(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G∘(kF))(v)
Accordingly, k(G∘F) = (kG)∘F = G∘(kF). (We emphasize that two mappings are shown to be equal by showing that they assign the same image to each point in the domain.)

6.34. Let F : V -> U and G : U -> W be linear. Hence G∘F : V -> W is linear. Show that (i) rank(G∘F) ≤ rank G, (ii) rank(G∘F) ≤ rank F.
(i) Since F(V) ⊆ U, we also have G(F(V)) ⊆ G(U) and so dim G(F(V)) ≤ dim G(U). Then
rank(G∘F) = dim((G∘F)(V)) = dim(G(F(V))) ≤ dim G(U) = rank G
(ii) By Theorem 6.4, dim(G(F(V))) ≤ dim F(V). Hence
rank(G∘F) = dim((G∘F)(V)) = dim(G(F(V))) ≤ dim F(V) = rank F

ALGEBRA OF LINEAR OPERATORS

6.35. Let S and T be the linear operators on R^2 defined by S(x, y) = (y, x) and T(x, y) = (0, x). Find formulas defining the operators S + T, 2S - 3T, ST, TS, S^2 and T^2.
(S + T)(x, y) = S(x, y) + T(x, y) = (y, x) + (0, x) = (y, 2x)
(2S - 3T)(x, y) = 2S(x, y) - 3T(x, y) = 2(y, x) - 3(0, x) = (2y, -x)
(ST)(x, y) = S(T(x, y)) = S(0, x) = (x, 0)
(TS)(x, y) = T(S(x, y)) = T(y, x) = (0, y)
S^2(x, y) = S(S(x, y)) = S(y, x) = (x, y). Note S^2 = I, the identity mapping.
T^2(x, y) = T(T(x, y)) = T(0, x) = (0, 0). Note T^2 = 0, the zero mapping.

6.36. Let T be the linear operator on R^2 defined by
T(3, 1) = (2, -4) and T(1, 1) = (0, 2)   (1)
(By Theorem 6.2, such a linear operator exists and is unique.) Find T(a, b). In particular, find T(7, 4).
First write (a, b) as a linear combination of (3, 1) and (1, 1) using unknown scalars x and y:
(a, b) = x(3, 1) + y(1, 1)   (2)
Hence (a, b) = (3x, x) + (y, y) = (3x + y, x + y) and so
3x + y = a,  x + y = b
Solving for x and y in terms of a and b,
x = (1/2)a - (1/2)b and y = -(1/2)a + (3/2)b   (3)
Now using (2), (1) and (3),
T(a, b) = xT(3, 1) + yT(1, 1) = x(2, -4) + y(0, 2) = (2x, -4x) + (0, 2y) = (2x, -4x + 2y) = (a - b, 5b - 3a)
Thus T(7, 4) = (7 - 4, 20 - 21) = (3, -1).

6.37.
Let T be the operator on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z). (i) Show that T is invertible. (ii) Find a formula for T^-1.
(i) The kernel W of T is the set of all (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,
T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)
Thus W is the solution space of the homogeneous system
2x = 0,  4x - y = 0,  2x + 3y - z = 0
which has only the trivial solution (0, 0, 0). Thus W = {0}; hence T is nonsingular and so by Theorem 6.9 is invertible.
(ii) Let (r, s, t) be the image of (x, y, z) under T; then (x, y, z) is the image of (r, s, t) under T^-1: T(x, y, z) = (r, s, t) and T^-1(r, s, t) = (x, y, z). We will find the values of x, y and z in terms of r, s and t, and then substitute in the above formula for T^-1. From
T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (r, s, t)
we find x = (1/2)r, y = 2r - s, z = 7r - 3s - t. Thus T^-1 is given by
T^-1(r, s, t) = ((1/2)r, 2r - s, 7r - 3s - t)

6.38. Let V be of finite dimension and let T be a linear operator on V. Recall that T is invertible if and only if T is nonsingular or one-to-one. Show that T is invertible if and only if T is onto.
By Theorem 6.4, dim V = dim(Im T) + dim(Ker T). Hence the following statements are equivalent: (i) T is onto, (ii) Im T = V, (iii) dim(Im T) = dim V, (iv) dim(Ker T) = 0, (v) Ker T = {0}, (vi) T is nonsingular, (vii) T is invertible.

6.39. Let V be of finite dimension and let T be a linear operator on V for which TS = I, for some operator S on V. (We call S a right inverse of T.) (i) Show that T is invertible. (ii) Show that S = T^-1. (iii) Give an example showing that the above need not hold if V is of infinite dimension.
(i) Let dim V = n. By the preceding problem, T is invertible if and only if T is onto; hence T is invertible if and only if rank T = n. We have n = rank I = rank(TS) ≤ rank T ≤ n. Hence rank T = n and T is invertible.
(ii) TT^-1 = T^-1T = I. Then S = IS = (T^-1T)S = T^-1(TS) = T^-1 I = T^-1.
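The explicit inverse computed in Problem 6.37 doubles as a concrete instance of (i) and (ii) above: composing T with the derived T^-1 in either order gives the identity. A quick sketch:

```python
def T(x, y, z):
    """The operator of Problem 6.37."""
    return (2*x, 4*x - y, 2*x + 3*y - z)

def T_inv(r, s, t):
    """The inverse formula derived in Problem 6.37(ii)."""
    return (r / 2, 2*r - s, 7*r - 3*s - t)
```

Both compositions return the starting vector (exactly, since the only division is a halving), confirming that T_inv is a two-sided inverse.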
(iii) Let V be the space of polynomials in t over K; say, p(t) = a0 + a1 t + a2 t^2 + ... + an t^n. Let T and S be the operators on V defined by
T(p(t)) = a1 + a2 t + ... + an t^(n-1) and S(p(t)) = a0 t + a1 t^2 + ... + an t^(n+1)
We have
(TS)(p(t)) = T(S(p(t))) = T(a0 t + a1 t^2 + ... + an t^(n+1)) = a0 + a1 t + ... + an t^n = p(t)
and so TS = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then (ST)(k) = S(T(k)) = S(0) = 0 ≠ k. Accordingly, ST ≠ I.

6.40. Let S and T be the linear operators on R^2 defined by S(x, y) = (0, x) and T(x, y) = (x, 0). Show that TS = 0 but ST ≠ 0. Also show that T^2 = T.
(TS)(x, y) = T(S(x, y)) = T(0, x) = (0, 0). Since TS assigns 0 = (0, 0) to every (x, y) ∈ R^2, it is the zero mapping: TS = 0.
(ST)(x, y) = S(T(x, y)) = S(x, 0) = (0, x). For example, (ST)(4, 2) = (0, 4). Thus ST ≠ 0, since it does not assign 0 = (0, 0) to every element of R^2.
For any (x, y) ∈ R^2, T^2(x, y) = T(T(x, y)) = T(x, 0) = (x, 0) = T(x, y). Hence T^2 = T.

MISCELLANEOUS PROBLEMS

6.41. Let {e1, e2, e3} be a basis of V and {f1, f2} a basis of U. Let T : V -> U be linear. Furthermore, suppose
T(e1) = a1 f1 + a2 f2
T(e2) = b1 f1 + b2 f2     and     A = [ a1  b1  c1 ]
T(e3) = c1 f1 + c2 f2                 [ a2  b2  c2 ]
Show that, for any v ∈ V, A[v]e = [T(v)]f where the vectors in K^3 and K^2 are written as column vectors.
Suppose v = k1 e1 + k2 e2 + k3 e3; then [v]e = (k1, k2, k3)^T. Also,
T(v) = k1 T(e1) + k2 T(e2) + k3 T(e3)
= k1(a1 f1 + a2 f2) + k2(b1 f1 + b2 f2) + k3(c1 f1 + c2 f2)
= (a1 k1 + b1 k2 + c1 k3) f1 + (a2 k1 + b2 k2 + c2 k3) f2
Accordingly,
[T(v)]f = [ a1 k1 + b1 k2 + c1 k3 ]
          [ a2 k1 + b2 k2 + c2 k3 ]
Computing, we obtain
A[v]e = [ a1  b1  c1 ] [ k1 ]   [ a1 k1 + b1 k2 + c1 k3 ]
        [ a2  b2  c2 ] [ k2 ] = [ a2 k1 + b2 k2 + c2 k3 ] = [T(v)]f
                       [ k3 ]

6.42. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence T is singular if and only if -T is singular.
Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence (kT)(v) = kT(v) = k0 = 0 and so kT is singular.
Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0; hence T(kw) = kT(w) = (kT)(w) = 0.
But fc # and w #^ implies kw ¥= 0; thus T is also singular. 6.43. Let £• be a linear operator on V for which E^ = E. (Such an operator is termed a projection.) Let C/ be the image of E and W the kernel. Show that: (i) if m G C/, then £'(m) - u, i.e. £7 is the identity map on U; (ii) if E ^I, then ^ is singular, i.e. E{v) = for some v^O; (iii) V = U®W. (i) If u&TJ, the image of S, then E{v) = u for some v GV. Hence using E^ — E, we have u ^ E(v) = EHv) = E(E{v)) = E{u) (ii) U E ¥= I then, for some v £ F, Bl'y) = u where v v^ m. By (i), E(u) - u. Thus E'(v — m) = S(v) — S(m) = m — m = where v — u¥=0 (iii) We first show that V - U + W. Let v e y. Set m = E(v) and w = v- E{v). Then ■U = £(l>) + I) — £'('U) = M + Of By definition, m = E{v) S J7, the image of E. We now show that w e TF, the kernel of E: E(w) = E{v - E(v)) = E(v) - E^v) = E(v) - E(v) - and thus w G W. Hence V = U + W. We next show that UnW - {0}. Let v eUnW. Since vGU, E(v) = v by (i). Since v&W, E{v) = 0. Thus V = E(v) = and so UnW - {0}. The above two properties imply that V = U ® W. CHAP. 6] LINEAR MAPPINGS 145 6.44. Show that a square matrix A is invertible if and only if it is nonsingular. (Compare with Theorem 6.9, page 130.) Recall that A is invertible if and only if A is row equivalent to the identity matrix 7. Thus the following statements are equivalent: (i) A is invertible. (ii) A and 1 are row equivalent, (iii) The equations AX = and IX = have the same solution space, (iv) AX = has only the zero solu- tion, (v) A is nonsingular. Supplementary Problems MAPPINGS 6.45. State whether each diagram defines a mapping from {1, 2, 3} into {4, 5, 6}. 6.46. Define each of the following mappings / : R -> R by a formula: (i) To each number let / assign its square plus 3. (ii) To each number let / assign its cube plus twice the number. (iii) To each number — 3 let / assign the number squared, and to each number < 3 let / assign the number —2. 6.47. Let /:R^R be defined by f(x) = x^-4x + 3. 
Find (i) /(4), (ii) /(-3), (iii) /(j/ - 2a;), (iv)/(a!-2). 6.48. Determine the number of different mappings from {o, 6} into {1, 2, 3}. 6.49. Let the mapping g assign to each name in the set {Betty, Martin, David, Alan, Rebecca} the number of different letters needed to spell the name. Find (i) the graph of g, (ii) the image of g. 6.50. Sketch the graph of each mapping: (i) f(x) = ^x — 1, (ii) g(x) = 2x^ — 4x — 3. 6.51. The mappings f:A-^B, g:B-^A, h:C-*B, F-.B^C and GiA^C are illustrated in the diagram below. Determine whether each of the following defines a composition mapping and, if it does, find its domain and co-domain: {\)g°f, {n)h°f, (iii) Fo/, (iv)G°f, {y)g°h, (vi) h°G°g. 6.52. Let /:R^R and fir : R -^ R be defined by f(x) = x^ + Sx + l and g(x) = 2x-3. Find formulas defining the composition mappings (i) f°g, (ii) g°f, (iii) g°g, (iv) f°f. 6.53. For any mapping f:A->B, show that 1b° f — f — f°'^A- 146 LINEAR MAPPINGS [CHAP. 6 6.54. For each of the following mappings / : R -> R find a formula for the inverse mapping: (i) f{x) = Sx - 7, (ii) fix) = x» + 2. LINEAR MAPPINGS 6.55. Show that the following mappings F are linear: (i) F : R2 ^ R2 defined by F(x, y) = {2x - y, x). (ii) F : R3 -» R2 defined by F{x, y, z) = {z,x + y). (iii) jF : R -> R2 defined by F(x) = (2.x, Zx). (iv) F : R2 ^ R2 defined by F(x, y) = [ax + hy, ex + dy) where a, 6, c, d e R. 6.56. Show that the following mappings F are not linear: (i) F : R2 ^ R2 defined by F(x, y) = (x^, y^). (ii) F : R3 ^ R2 defined by Fix, y,z) = ix + l,y + z). (iii) F : R ^ R2 defined by Fix) = ix, 1). (iv) F:R2->R defined by Fix,y) = \x-y\. 6.57. Let V be the vector space of polynomials in t over K. Show that the mappings T :V -*V and S :V -> V defined below are linear: Tiaa + ait + • • • + a^t") = a^t + a^t^ + ■ • • + a„t" + i S(ao + ai« + • • • + a„t") = Q + ax + a^t + ■ ■ ■ + aj"--^ 6.58. Let V be the vector space ot nXn matrices over K; and let M be an arbitrary matrix in V. 
Show that the first two mappings T -.V ^V are linear, but the third is not linear (unless M = 0): (i) r(A) = MA, (ii) TiA) - MA -AM, (iii) TiA) =^ M + A. 6.59. Find Tia, b) where T : R2 ^ R3 is defined by r(l, 2) = (3, -1, 5) and r(0, 1) = (2, 1, -1). 6.60. Find Tia, b, c) where T : RS ^ R is defined by Til, 1, 1) = 3, r(0, 1, -2) = 1 and ^(O, 0, 1) = -2 6.6L Suppose F:V -*U is linear. Show that, for any vGV, Fi-v) = -Fiv). 6.62. Let W be a subspace of V. Show that the inclusion map of W into V, denoted by i:W cV and defined by t(w) = w, is linear. KERNEL AND IMAGE OF LINEAR MAPPINGS 6.63. For each of the following linear mappings F, find a basis and the dimension of (a) its image U and (6) its kernel W: (i) F : R3 -> R8 defined by F(x, y, z) = ix + 2y,y-z,x + 2z). (ii) F : R2 ^ R2 defined by Fix,y) = ix + y,x + y). (iii) F : R3 ^ R2 defined by Fix, y,z) - ix + y,y + z). 6.64. Let V be the vector space of 2 X 2 matrices over R and let M = f j . Let F : V -* V be the linear map defined by FiA) = MA. Find a basis and the dimension of (i) the kernel TF of F and (ii) the image U of F. 6.65. Find a linear mapping F : R3 ^ RS whose image is generated by (1, 2, 3) and (4, 5, 6). 6.66. Find a linear mapping F : R* ^ RS whose kernel is generated by (1, 2, 3, 4) and (0, 1, 1, 1). 6.67. Let V be the vector space of polynomials in t over R. Let D:V -*V be the differential operator: Dif) = df/dt. Find the kernel and image of D. 6.68. Let F:V-^U be linear. Show that (i) the image of any subspace of y is a subspace of U and (ii) the preimage of any subspace of U is a subspace of V. CHAP. 6] LINEAR MAPPINGS 147 6.69. Each of the following matrices determines a linear map from K,* into R^: '12 1^ (i) A = ( 2 -1 2 -1 I (ii) B = ^1 -3 2 -2/ Find a basis and the dimension of the image U and the kernel W of each map. 6.70. Let r : C -> C be the conjugate mapping on the complex field C. That is, T(z) = z where z G C, or T(a + bi) = a— bi where a, 6 e R. 
(i) Show that T is not linear if C is viewed as a vector space over itself, (ii) Show that T is linear if C is viewed as a vector space over the real field R. OPERATIONS WITH LINEAR MAPPINGS 6.71. Let iJ' : R3 -» R2 and G : R^ ^ R2 be defined by F{x, y, z) = (y,x + z) and G(x, y, z) = (2«, x - y). Find formulas defining the mappings F + G and SF — 20. 6.72. Let H : R2 -♦ R2 be defined by H(x, y) — (y, 2x). Using the mappings F and G in the preceding problem, find formulas defining the mappings: (i) H°F and H °G, (ii) F°H and G°H, (in) Ho(F + G) and HoF + H°G. 6.73. Show that the following mappings F, G and H are linearly independent: (i) F,G,He Horn (R2, R2) defined by Fix, y) = {x, 2y), G{x, y) = {y,x + y), H{x, y) = (0, x). (ii) F,G,He Hom (R3, R) defined by F{x, y, z) = x + y + z, G(x, y,z) — y + z, H(x, y, z) = x — z. 6.74. For F,G & Rom {V, U), show that rank (F + G) ^ rank i^ + rank G. (Here V has finite dimension.) 6.75. Let F :V -^ U and G:U-*V be linear. Show that if F and G are nonsingular then G°F is nonsingular. Give an example where G°F is nonsingular but G is not. 6.76. Prove that Hom (V, U) does satisfy all the required axioms of a vector space. That is, prove Theorem 6.6, page 128. ALGEBRA OP LINEAR OPERATORS 6.77. Let S and T be the linear operators on R2 defined by S{x, y) — {x + y, 0) and T{x, y) = (—y, x). Find formulas defining the operators S + T, 5S - ST, ST, TS, S^ and T^. 6.78. Let T be the linear operator on R2 defined by T{x, y) — {x + 2y, 3x + Ay). Find p(T) where p{t) = t2 _ 5f _ 2. 6.79. Show that each of the following operators T on R^ is invertible, and find a formula for T~h (i) T{x, y, z) = (x-3y- 2z, y - 4«, z), (ii) T{x, y,z) = {x + z,x- z, y). 6.80. Suppose S and T are linear operators on V and that S is nonsingular. Assume V has finite dimen- sion. Show that rank (ST) = rank (TS) = rank T. 6.81. Suppose V = U ® W. Let Ei and E2 be the linear operators on V defined by Ei(v) = u, E2(v) = w, where v = u + w, ue.U,w&W. 
Show that: (i) Bj = E^ and eI = E2, i.e. that Ei and £?2 are "projections"; (ii) Ei + E2 — I, the identity mapping; (iii) E1E2 = and E2E1 = 0. 6.82. Let El and E2 be linear operators on V satisfying (i), (ii) and (iii) of Problem 6.81. Show that V is the direct sum of the image of E^ and the image of £2- ^ — Im £?i © Im £2- 6.83. Show that if the linear operators S and T are invertible, then ST is invertible and (ST)-^ = T-^S-^. 148 LINEAR MAPPINGS [CHAP. 6 6.84. Let V have finite dimension, and let T be a linear operator on V such that rank (T^) = rank T. Show that Ker TnlmT = {0}. MISCELLANEOUS PROBLEMS 6.85. Suppose T -.K^-^ X"» is a linear mapping. Let {e^, . . . , e„} be the usual basis of Z" and let A be the mXn matrix whose columns are the vectors r(ei), . . ., r(e„) respectively. Show that, for every vector V G R"-, T(v) = Av, where v is written as a column vector. 6.86. Suppose F -.V -* U is linear and fc is a nonzero scalar. Show that the maps F and kF have the same kernel and the same image. 6.87. Show that if F:V -^ U is onto, then dim U - dim V. Determine all linear maps T:W-*R* which are onto. 6.88. Find those theorems of Chapter 3 which prove that the space of w-square matrices over K is an associative algebra over K. 6.89. Let T :V ^ U be linear and let W he a subspace of V. The restriction of T to W is the map Tt^-.W^U defined by r^(w) = T{w), for every wGW. Prove the following, (i) T^^ is linear. (ii) Ker T^ = Ker T n W. (iii) Im T^r = T(W). 6.90. Two operators S, T G A(V) are said to be similar if there exists an invertible operator P G A{V) for which S = P-^TP. Prove the following, (i) Similarity of operators is an equivalence relation. (ii) Similar operators have the same rank (when V has finite dimension). Answers to Supplementary Problems 6.45. (i) No, (ii) Yes, (iii) No. 6.46. (i) fix) = x2 + 3, (ii) /(») = a;3 + 2a;, (iii) fix) = {"^ if » - 3 [-2 if a; < 3 6.47. (i) 3, (ii) 24, (iii) j/2 - 4xy + 4x2 ^^y + ^x + S, (iv) a;^ - 8a! + 15. 6.48. 
Nine.

6.49. (i) {(Betty, 4), (Martin, 6), (David, 4), (Alan, 3), (Rebecca, 5)}. (ii) Image of g = {3, 4, 5, 6}.

6.51. (i) (g∘f) : A -> A, (ii) No, (iii) (F∘f) : A -> C, (iv) No, (v) (g∘h) : C -> A, (vi) (h∘G∘g) : B -> B.

6.52. (i) (f∘g)(x) = 4x^2 - 6x + 1, (ii) (g∘f)(x) = 2x^2 + 6x - 1, (iii) (g∘g)(x) = 4x - 9, (iv) (f∘f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5.

6.54. (i) f^-1(x) = (x + 7)/3, (ii) f^-1(x) = (x - 2)^(1/3).

6.59. T(a, b) = (-a + 2b, -3a + b, 7a - b).

6.60. T(a, b, c) = 8a - 3b - 2c.

6.61. F(v) + F(-v) = F(v + (-v)) = F(0) = 0; hence F(-v) = -F(v).

6.63. (i) (a) {(1, 0, 1), (0, 1, -2)}, dim U = 2; (b) {(2, -1, -1)}, dim W = 1.
(ii) (a) {(1, 1)}, dim U = 1; (b) {(1, -1)}, dim W = 1.
(iii) (a) {(1, 0), (0, 1)}, dim U = 2; (b) {(1, -1, 1)}, dim W = 1.

6.64. (i) dim (Ker F) = 2. (ii) dim (Im F) = 2.

6.65. F(x, y, z) = (x + 4y, 2x + 5y, 3x + 6y).

6.66. F(x, y, z, w) = (x + y - z, 2x + y - w, 0).

6.67. The kernel of D is the set of constant polynomials. The image of D is the entire space V.

6.69. (i) (a) {(1, 2, 1), (0, 1, 1)} basis of Im A; dim (Im A) = 2. (b) {(4, -2, -5, 0), (1, -3, 0, 5)} basis of Ker A; dim (Ker A) = 2.
(ii) (a) Im B = R^3; (b) {(-1, 2/3, 1, 1)} basis of Ker B; dim (Ker B) = 1.

6.71. (F + G)(x, y, z) = (y + 2z, 2x - y + z), (3F - 2G)(x, y, z) = (3y - 4z, x + 2y + 3z).

6.72. (i) (H∘F)(x, y, z) = (x + z, 2y), (H∘G)(x, y, z) = (x - y, 4z). (ii) Not defined. (iii) (H∘(F + G))(x, y, z) = (H∘F + H∘G)(x, y, z) = (2x - y + z, 2y + 4z).

6.77. (S + T)(x, y) = (x, x)               (ST)(x, y) = (x - y, 0)
(5S - 3T)(x, y) = (5x + 8y, -3x)           (TS)(x, y) = (0, x + y)
S^2(x, y) = (x + y, 0); note that S^2 = S.
T^2(x, y) = (-x, -y); note that T^2 + I = 0, hence T is a zero of x^2 + 1.

6.78. p(T) = 0.

6.79. (i) T^-1(r, s, t) = (r + 3s + 14t, s + 4t, t), (ii) T^-1(r, s, t) = ((1/2)r + (1/2)s, t, (1/2)r - (1/2)s).

6.87. There are no linear maps from R^3 into R^4 which are onto.

Chapter 7

Matrices and Linear Operators

INTRODUCTION

Suppose {e1, . . .
, en} is a basis of a vector space V over a field K and, for v ∈ V, suppose
v = a1e1 + a2e2 + ... + anen
Then the coordinate vector of v relative to {ei}, which we write as a column vector unless otherwise specified or implied, is
[v]e = (a1, a2, ..., an)^T
Recall that the mapping v -> [v]e, determined by the basis {ei}, is an isomorphism from V onto the space K^n. In this chapter we show that there is also an isomorphism, determined by the basis {ei}, from the algebra A(V) of linear operators on V onto the algebra 𝒜 of n-square matrices over K. A similar result also holds for linear mappings F : V -> U, from one space into another.

MATRIX REPRESENTATION OF A LINEAR OPERATOR

Let T be a linear operator on a vector space V over a field K and suppose {e1, ..., en} is a basis of V. Now T(e1), ..., T(en) are vectors in V and so each is a linear combination of the elements of the basis {ei}:
T(e1) = a11 e1 + a12 e2 + ... + a1n en
T(e2) = a21 e1 + a22 e2 + ... + a2n en
......................................
T(en) = an1 e1 + an2 e2 + ... + ann en
The following definition applies.

Definition: The transpose of the above matrix of coefficients, denoted by [T]e or [T], is called the matrix representation of T relative to the basis {ei} or simply the matrix of T in the basis {ei}:

       [ a11  a21  ...  an1 ]
[T]e = [ a12  a22  ...  an2 ]
       [ ..................  ]
       [ a1n  a2n  ...  ann ]

Example 7.1: Let V be the vector space of polynomials in t over R of degree ≤ 3, and let D : V -> V be the differential operator defined by D(p(t)) = d(p(t))/dt. We compute the matrix of D in the basis {1, t, t^2, t^3}. We have:
D(1) = 0 = 0 + 0t + 0t^2 + 0t^3
D(t) = 1 = 1 + 0t + 0t^2 + 0t^3
D(t^2) = 2t = 0 + 2t + 0t^2 + 0t^3
D(t^3) = 3t^2 = 0 + 0t + 3t^2 + 0t^3
Accordingly,
      [ 0  1  0  0 ]
[D] = [ 0  0  2  0 ]
      [ 0  0  0  3 ]
      [ 0  0  0  0 ]

Example 7.2: Let T be the linear operator on R^2 defined by T(x, y) = (4x - 2y, 2x + y). We compute the matrix of T in the basis {f1 = (1, 1), f2 = (-1, 0)}. We have
T(f1) = T(1, 1) = (2, 3) = 3(1, 1) + (-1, 0) = 3f1 + f2
T(f2) = T(-1, 0) = (-4, -2) = -2(1, 1) + 2(-1, 0) = -2f1 + 2f2
Accordingly,
[T]f = [ 3  -2 ]
       [ 1   2 ]

Remark: Recall that any n-square matrix A over K defines a linear operator on K^n by the map v -> Av (where v is written as a column vector). We show (Problem 7.7) that the matrix representation of this operator is precisely the matrix A if we use the usual basis of K^n.

Our first theorem tells us that the "action" of an operator T on a vector v is preserved by its matrix representation:

Theorem 7.1: Let {e1, ..., en} be a basis of V and let T be any operator on V. Then, for any vector v ∈ V, [T]e [v]e = [T(v)]e.

That is, if we multiply the coordinate vector of v by the matrix representation of T, then we obtain the coordinate vector of T(v).

Example 7.3: Consider the differential operator D : V -> V in Example 7.1. Let
p(t) = a + bt + ct^2 + dt^3 and so D(p(t)) = b + 2ct + 3dt^2
Hence, relative to the basis {1, t, t^2, t^3},
[p(t)] = (a, b, c, d)^T and [D(p(t))] = (b, 2c, 3d, 0)^T
We show that Theorem 7.1 does hold here:
            [ 0  1  0  0 ] [ a ]   [ b  ]
[D][p(t)] = [ 0  0  2  0 ] [ b ] = [ 2c ] = [D(p(t))]
            [ 0  0  0  3 ] [ c ]   [ 3d ]
            [ 0  0  0  0 ] [ d ]   [ 0  ]

Example 7.4: Consider the linear operator T : R^2 -> R^2 in Example 7.2: T(x, y) = (4x - 2y, 2x + y). Let v = (5, 7). Then
v = (5, 7) = 7(1, 1) + 2(-1, 0) = 7f1 + 2f2
T(v) = (6, 17) = 17(1, 1) + 11(-1, 0) = 17f1 + 11f2
where f1 = (1, 1) and f2 = (-1, 0). Hence, relative to the basis {f1, f2},
[v]f = (7, 2)^T and [T(v)]f = (17, 11)^T
Using the matrix [T]f in Example 7.2, we verify that Theorem 7.1 holds here:
[T]f [v]f = [ 3  -2 ] [ 7 ] = [ 17 ] = [T(v)]f
            [ 1   2 ] [ 2 ]   [ 11 ]
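The check in Example 7.4 can be automated: express vectors in the basis {f1 = (1, 1), f2 = (-1, 0)}, build [T]f column by column from the images of f1 and f2, and compare both sides of Theorem 7.1. A short sketch (the coordinate solver below is specific to this particular basis):

```python
def T(x, y):
    """The operator of Example 7.2."""
    return (4*x - 2*y, 2*x + y)

def coords_f(v):
    """Coordinates of v = (x, y) in the basis f1 = (1, 1), f2 = (-1, 0):
    (x, y) = a(1, 1) + b(-1, 0)  gives  a = y, b = y - x."""
    x, y = v
    return (y, y - x)

# Columns of [T]f are the f-coordinates of T(f1) and T(f2)
col1, col2 = coords_f(T(1, 1)), coords_f(T(-1, 0))
Tf = [[col1[0], col2[0]], [col1[1], col2[1]]]   # expect [[3, -2], [1, 2]]

def matvec(A, v):
    """Multiply a 2x2 matrix by a column vector."""
    return tuple(sum(A[i][j] * v[j] for j in range(2)) for i in range(2))
```

Running it reproduces [T]f from Example 7.2 and the equality [T]f [v]f = [T(v)]f for v = (5, 7).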
Then the mapping T ↦ [T]_e is a vector space isomorphism from A(V) onto A. That is, the mapping is one-one and onto and, for any S, T ∈ A(V) and any k ∈ K,

[T + S]_e = [T]_e + [S]_e   and   [kT]_e = k[T]_e

Theorem 7.3: For any operators S, T ∈ A(V), [ST]_e = [S]_e [T]_e.

We illustrate the above theorems in the case dim V = 2. Suppose {e1, e2} is a basis of V, and T and S are operators on V for which

T(e1) = a1 e1 + a2 e2        S(e1) = c1 e1 + c2 e2
T(e2) = b1 e1 + b2 e2        S(e2) = d1 e1 + d2 e2

so that

[T]_e = ( a1 b1 )   and   [S]_e = ( c1 d1 )
        ( a2 b2 )                 ( c2 d2 )

Now we have

(T + S)(e1) = T(e1) + S(e1) = a1 e1 + a2 e2 + c1 e1 + c2 e2 = (a1 + c1)e1 + (a2 + c2)e2
(T + S)(e2) = T(e2) + S(e2) = b1 e1 + b2 e2 + d1 e1 + d2 e2 = (b1 + d1)e1 + (b2 + d2)e2

Thus

[T + S]_e = ( a1+c1  b1+d1 ) = ( a1 b1 ) + ( c1 d1 ) = [T]_e + [S]_e
            ( a2+c2  b2+d2 )   ( a2 b2 )   ( c2 d2 )

Also, for k ∈ K, we have

(kT)(e1) = k T(e1) = k(a1 e1 + a2 e2) = ka1 e1 + ka2 e2
(kT)(e2) = k T(e2) = k(b1 e1 + b2 e2) = kb1 e1 + kb2 e2

Thus

[kT]_e = ( ka1 kb1 ) = k ( a1 b1 ) = k[T]_e
         ( ka2 kb2 )     ( a2 b2 )

Finally, we have

(ST)(e1) = S(T(e1)) = S(a1 e1 + a2 e2) = a1 S(e1) + a2 S(e2)
         = a1(c1 e1 + c2 e2) + a2(d1 e1 + d2 e2) = (a1 c1 + a2 d1)e1 + (a1 c2 + a2 d2)e2
(ST)(e2) = S(T(e2)) = S(b1 e1 + b2 e2) = b1 S(e1) + b2 S(e2)
         = b1(c1 e1 + c2 e2) + b2(d1 e1 + d2 e2) = (b1 c1 + b2 d1)e1 + (b1 c2 + b2 d2)e2

Accordingly,

[ST]_e = ( a1c1 + a2d1   b1c1 + b2d1 ) = ( c1 d1 ) ( a1 b1 ) = [S]_e [T]_e
         ( a1c2 + a2d2   b1c2 + b2d2 )   ( c2 d2 ) ( a2 b2 )

CHANGE OF BASIS

We have shown that we can represent vectors by n-tuples (column vectors) and linear operators by matrices once we have selected a basis. We ask the following natural question: How does our representation change if we select another basis? In order to answer this question, we first need a definition.

Definition: Let {e1, ..., en} be a basis of V and let {f1, ..., fn} be another basis.
Suppose

f1 = a11 e1 + a12 e2 + ... + a1n en
f2 = a21 e1 + a22 e2 + ... + a2n en
...................................
fn = an1 e1 + an2 e2 + ... + ann en

Then the transpose P of the above matrix of coefficients is termed the transition matrix from the "old" basis {e_i} to the "new" basis {f_i}:

    ( a11  a21  ...  an1 )
P = ( a12  a22  ...  an2 )
    ( ................... )
    ( a1n  a2n  ...  ann )

We comment that since the vectors f1, ..., fn are linearly independent, the matrix P is invertible (Problem 5.47). In fact, its inverse P^−1 is the transition matrix from the basis {f_i} back to the basis {e_i}.

Example 7.5: Consider the following two bases of R^2:

{e1 = (1, 0), e2 = (0, 1)}   and   {f1 = (1, 1), f2 = (−1, 0)}

Then

f1 = (1, 1)  = (1, 0) + (0, 1)   = e1 + e2
f2 = (−1, 0) = −(1, 0) + 0(0, 1) = −e1 + 0e2

Hence the transition matrix P from the basis {e_i} to the basis {f_i} is

P = ( 1 −1 )
    ( 1  0 )

We also have

e1 = (1, 0) = 0(1, 1) − (−1, 0) = 0f1 − f2
e2 = (0, 1) = (1, 1) + (−1, 0)  = f1 + f2

Hence the transition matrix Q from the basis {f_i} back to the basis {e_i} is

Q = (  0 1 )
    ( −1 1 )

Observe that P and Q are inverses:

PQ = ( 1 −1 ) (  0 1 ) = ( 1 0 ) = I
     ( 1  0 ) ( −1 1 )   ( 0 1 )

We now show how coordinate vectors are affected by a change of basis.

Theorem 7.4: Let P be the transition matrix from a basis {e_i} to a basis {f_i} in a vector space V. Then, for any vector v ∈ V, P[v]_f = [v]_e. Hence [v]_f = P^−1 [v]_e.

We emphasize that even though P is called the transition matrix from the old basis {e_i} to the new basis {f_i}, its effect is to transform the coordinates of a vector in the new basis {f_i} back to the coordinates in the old basis {e_i}.

We illustrate the above theorem in the case dim V = 3. Suppose P is the transition matrix from a basis {e1, e2, e3} of V to a basis {f1, f2, f3} of V; say,

f1 = a1 e1 + a2 e2 + a3 e3                  ( a1 b1 c1 )
f2 = b1 e1 + b2 e2 + b3 e3    Hence   P  =  ( a2 b2 c2 )
f3 = c1 e1 + c2 e2 + c3 e3                  ( a3 b3 c3 )

Now suppose v ∈ V and, say, v = k1 f1 + k2 f2 + k3 f3. Then, substituting for the f_i from above, we obtain

v = k1(a1 e1 + a2 e2 + a3 e3) + k2(b1 e1 + b2 e2 + b3 e3) + k3(c1 e1 + c2 e2 + c3 e3)
  = (a1 k1 + b1 k2 + c1 k3)e1
    + (a2 k1 + b2 k2 + c2 k3)e2 + (a3 k1 + b3 k2 + c3 k3)e3

Thus

        ( k1 )                  ( a1 k1 + b1 k2 + c1 k3 )
[v]_f = ( k2 )   and   [v]_e =  ( a2 k1 + b2 k2 + c2 k3 )
        ( k3 )                  ( a3 k1 + b3 k2 + c3 k3 )

Accordingly,

         ( a1 b1 c1 ) ( k1 )   ( a1 k1 + b1 k2 + c1 k3 )
P[v]_f = ( a2 b2 c2 ) ( k2 ) = ( a2 k1 + b2 k2 + c2 k3 ) = [v]_e
         ( a3 b3 c3 ) ( k3 )   ( a3 k1 + b3 k2 + c3 k3 )

Also, multiplying the above equation by P^−1, we have

P^−1 [v]_e = P^−1 P [v]_f = I [v]_f = [v]_f

Example 7.6: Let v = (a, b) ∈ R^2. Then, for the bases of R^2 in the preceding example,

v = (a, b) = a(1, 0) + b(0, 1)        = a e1 + b e2
v = (a, b) = b(1, 1) + (b − a)(−1, 0) = b f1 + (b − a) f2

Hence

[v]_e = ( a )   and   [v]_f = (   b   )
        ( b )                 ( b − a )

By the preceding example, the transition matrix P from {e_i} to {f_i} and its inverse P^−1 are given by

P = ( 1 −1 )   and   P^−1 = (  0 1 )
    ( 1  0 )                ( −1 1 )

We verify the result of Theorem 7.4:

P[v]_f = ( 1 −1 ) (   b   ) = ( a ) = [v]_e
         ( 1  0 ) ( b − a )   ( b )

The next theorem shows how matrix representations of linear operators are affected by a change of basis.

Theorem 7.5: Let P be the transition matrix from a basis {e_i} to a basis {f_i} in a vector space V. Then, for any linear operator T on V, [T]_f = P^−1 [T]_e P.

Example 7.7: Let T be the linear operator on R^2 defined by T(x, y) = (4x − 2y, 2x + y). Then for the bases of R^2 in Example 7.5, we have

T(e1) = T(1, 0) = (4, 2)  = 4(1, 0) + 2(0, 1) = 4e1 + 2e2
T(e2) = T(0, 1) = (−2, 1) = −2(1, 0) + (0, 1) = −2e1 + e2

Accordingly,

[T]_e = ( 4 −2 )
        ( 2  1 )

We compute [T]_f using Theorem 7.5:

[T]_f = P^−1 [T]_e P = (  0 1 ) ( 4 −2 ) ( 1 −1 ) = ( 3 −2 )
                       ( −1 1 ) ( 2  1 ) ( 1  0 )   ( 1  2 )

Note that this agrees with the derivation of [T]_f in Example 7.2.

Remark: Suppose P = (a_ij) is any n-square invertible matrix over a field K. Now if {e1, ..., en} is a basis of a vector space V over K, then the n vectors

f_i = a_1i e1 + a_2i e2 + ... + a_ni en,   i = 1, ..., n

are linearly independent (Problem 5.47) and so form another basis of V. Furthermore, P is the transition matrix from the basis {e_i} to the basis {f_i}. Accordingly, if A is any matrix representation of a linear operator T on V, then the matrix B = P^−1 A P is also a matrix representation of T.
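The change-of-basis computation of Example 7.7 can be replayed numerically. The sketch below is ours, not the book's (the helper `matmul` is an assumed name); it checks that P^−1 [T]_e P reproduces the matrix [T]_f found in Example 7.2.

```python
# Re-verify Example 7.7: T(x, y) = (4x - 2y, 2x + y) with the bases
# of Example 7.5.  Matrices are lists of rows.

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

Te   = [[4, -2], [2, 1]]    # [T]_e, computed in Example 7.7
P    = [[1, -1], [1, 0]]    # transition matrix from {e_i} to {f_i}
Pinv = [[0, 1], [-1, 1]]    # its inverse, the transition matrix back

Tf = matmul(Pinv, matmul(Te, P))   # Theorem 7.5: [T]_f = P^{-1} [T]_e P
assert Tf == [[3, -2], [1, 2]]     # the matrix [T]_f of Example 7.2
```

The same three-line pattern checks any operator/basis pair once [T]_e, P and P^−1 are written down.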
SIMILARITY

Suppose A and B are square matrices for which there exists an invertible matrix P such that B = P^−1 A P. Then B is said to be similar to A, or is said to be obtained from A by a similarity transformation. We show (Problem 7.16) that similarity of matrices is an equivalence relation. Thus by Theorem 7.5 and the above remark, we have the following basic result.

Theorem 7.6: Two matrices A and B represent the same linear operator T if and only if they are similar to each other.

That is, all the matrix representations of the linear operator T form an equivalence class of similar matrices.

A linear operator T is said to be diagonalizable if for some basis {e_i} it is represented by a diagonal matrix; the basis {e_i} is then said to diagonalize T. The preceding theorem gives us the following result.

Theorem 7.7: Let A be a matrix representation of a linear operator T. Then T is diagonalizable if and only if there exists an invertible matrix P such that P^−1 A P is a diagonal matrix.

That is, T is diagonalizable if and only if its matrix representation can be diagonalized by a similarity transformation.

We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that every operator T can be represented by certain "standard" matrices called its normal or canonical forms. That discussion will require some theory of fields, polynomials and determinants.

Now suppose f is a function on square matrices which assigns the same value to similar matrices; that is, f(A) = f(B) whenever A is similar to B. Then f induces a function, also denoted by f, on linear operators T in the following natural way: f(T) = f([T]_e), where {e_i} is any basis. The function is well-defined by the preceding theorem.

The determinant is perhaps the most important example of this type of function. Another important example follows.
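The determinant's invariance under similarity can be checked directly on a small example. This sketch is ours, not the book's; choosing P with det(P) = 1 keeps every entry an exact integer.

```python
# Check det(P^{-1} A P) = det(A) on a 2x2 example.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A    = [[1, 2], [3, 4]]
P    = [[2, 1], [1, 1]]      # det(P) = 1
Pinv = [[1, -1], [-1, 2]]    # exact integer inverse of P

B = matmul(Pinv, matmul(A, P))   # B is similar to A
assert det2(B) == det2(A) == -2
```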
Example 7.8: The trace of a square matrix A = (a_ij), written tr(A), is defined to be the sum of its diagonal elements:

tr(A) = a11 + a22 + ... + ann

We show (Problem 7.17) that similar matrices have the same trace. Thus we can speak of the trace of a linear operator T; it is the trace of any one of its matrix representations: tr(T) = tr([T]_e).

MATRICES AND LINEAR MAPPINGS

We now consider the general case of linear mappings from one space into another. Let V and U be vector spaces over the same field K and, say, dim V = m and dim U = n. Furthermore, let {e1, ..., em} and {f1, ..., fn} be arbitrary but fixed bases of V and U respectively.

Suppose F : V → U is a linear mapping. Then the vectors F(e1), ..., F(em) belong to U and so each is a linear combination of the f_i:

F(e1) = a11 f1 + a12 f2 + ... + a1n fn
F(e2) = a21 f1 + a22 f2 + ... + a2n fn
......................................
F(em) = am1 f1 + am2 f2 + ... + amn fn

The transpose of the above matrix of coefficients, denoted by [F]_e^f, is called the matrix representation of F relative to the bases {e_i} and {f_i}, or the matrix of F in the bases {e_i} and {f_i}:

          ( a11  a21  ...  am1 )
[F]_e^f = ( a12  a22  ...  am2 )
          ( ................... )
          ( a1n  a2n  ...  amn )

The following theorems apply.

Theorem 7.8: For any vector v ∈ V, [F]_e^f [v]_e = [F(v)]_f.

That is, multiplying the coordinate vector of v in the basis {e_i} by the matrix [F]_e^f, we obtain the coordinate vector of F(v) in the basis {f_i}.

Theorem 7.9: The mapping F ↦ [F]_e^f is an isomorphism from Hom(V, U) onto the vector space of n × m matrices over K. That is, the mapping is one-one and onto and, for any F, G ∈ Hom(V, U) and any k ∈ K,

[F + G]_e^f = [F]_e^f + [G]_e^f   and   [kF]_e^f = k[F]_e^f

Remark: Recall that any n × m matrix A over K has been identified with the linear mapping from K^m into K^n given by v ↦ Av. Now suppose V and U are vector spaces over K of dimensions m and n respectively, and suppose {e_i} is a basis of V and {f_i} is a basis of U.
Then in view of the preceding theorem, we shall also identify A with the linear mapping F : V → U given by [F(v)]_f = A[v]_e. We comment that if other bases of V and U are given, then A is identified with another linear mapping from V into U.

Theorem 7.10: Let {e_i}, {f_i} and {g_i} be bases of V, U and W respectively. Let F : V → U and G : U → W be linear mappings. Then

[G ∘ F]_e^g = [G]_f^g [F]_e^f

That is, relative to the appropriate bases, the matrix representation of the composition of two linear mappings is equal to the product of the matrix representations of the individual mappings.

We lastly show how the matrix representation of a linear mapping F : V → U is affected when new bases are selected.

Theorem 7.11: Let P be the transition matrix from a basis {e_i} to a basis {e_i'} in V, and let Q be the transition matrix from a basis {f_i} to a basis {f_i'} in U. Then for any linear mapping F : V → U,

[F]_{e'}^{f'} = Q^−1 [F]_e^f P

Thus in particular,

[F]_e^{f'} = Q^−1 [F]_e^f,   i.e. when the change of basis only takes place in U; and

[F]_{e'}^f = [F]_e^f P,   i.e. when the change of basis only takes place in V.

Note that Theorems 7.1, 7.2, 7.3 and 7.5 are special cases of Theorems 7.8, 7.9, 7.10 and 7.11 respectively.

The next theorem shows that every linear mapping from one space into another can be represented by a very simple matrix.

Theorem 7.12: Let F : V → U be linear and, say, rank F = r. Then there exist bases of V and of U such that the matrix representation A of F has the form

A = ( I  0 )
    ( 0  0 )

where I is the r-square identity matrix. We call A the normal or canonical form of F.

WARNING

As noted previously, some texts write the operator symbol T to the right of the vector v on which it acts, that is,

vT   instead of   T(v)

In such texts, vectors and operators are represented by n-tuples and matrices which are the transposes of those appearing here. That is, if

v = k1 e1 + k2 e2 + ... + kn en

then they write

[v]_e = (k1, k2,
..., kn)   instead of   [v]_e = (k1, k2, ..., kn)^t

And if

T(e1) = a1 e1 + a2 e2 + ... + an en
T(e2) = b1 e1 + b2 e2 + ... + bn en
...................................
T(en) = c1 e1 + c2 e2 + ... + cn en

then they write

        ( a1 a2 ... an )                        ( a1 b1 ... c1 )
[T]_e = ( b1 b2 ... bn )   instead of   [T]_e = ( a2 b2 ... c2 )
        ( .............. )                      ( .............. )
        ( c1 c2 ... cn )                        ( an bn ... cn )

This is also true for the transition matrix from one basis to another and for matrix representations of linear mappings F : V → U. We comment that such texts have theorems which are analogous to the ones appearing here.

Solved Problems

MATRIX REPRESENTATIONS OF LINEAR OPERATORS

7.1. Find the matrix representation of each of the following operators T on R^2 relative to the usual basis {e1 = (1, 0), e2 = (0, 1)}: (i) T(x, y) = (2y, 3x − y), (ii) T(x, y) = (3x − 4y, x + 5y).

Note first that if (a, b) ∈ R^2, then (a, b) = a e1 + b e2.

(i)  T(e1) = T(1, 0) = (0, 3)  = 0e1 + 3e2    and   [T]_e = ( 0  2 )
     T(e2) = T(0, 1) = (2, −1) = 2e1 − e2                   ( 3 −1 )

(ii) T(e1) = T(1, 0) = (3, 1)  = 3e1 + e2     and   [T]_e = ( 3 −4 )
     T(e2) = T(0, 1) = (−4, 5) = −4e1 + 5e2                 ( 1  5 )

7.2. Find the matrix representation of each operator T in the preceding problem relative to the basis {f1 = (1, 3), f2 = (2, 5)}.

We must first find the coordinates of an arbitrary vector (a, b) ∈ R^2 with respect to the basis {f_i}. We have

(a, b) = x(1, 3) + y(2, 5) = (x + 2y, 3x + 5y)

or   x + 2y = a and 3x + 5y = b,   or   x = 2b − 5a and y = 3a − b

Thus   (a, b) = (2b − 5a)f1 + (3a − b)f2

(i) We have T(x, y) = (2y, 3x − y). Hence

T(f1) = T(1, 3) = (6, 0)  = −30f1 + 18f2    and   [T]_f = ( −30 −48 )
T(f2) = T(2, 5) = (10, 1) = −48f1 + 29f2                  (  18  29 )

(ii) We have T(x, y) = (3x − 4y, x + 5y). Hence

T(f1) = T(1, 3) = (−9, 16)  = 77f1 − 43f2   and   [T]_f = (  77 124 )
T(f2) = T(2, 5) = (−14, 27) = 124f1 − 69f2                ( −43 −69 )

7.3. Suppose that T is the linear operator on R^3 defined by

T(x, y, z) = (a1 x + a2 y + a3 z, b1 x + b2 y + b3 z, c1 x + c2 y + c3 z)

Show that the matrix of T in the usual basis {e_i} is given by

        ( a1 a2 a3 )
[T]_e = ( b1 b2 b3 )
        ( c1 c2 c3 )

That is, the rows of [T]_e are obtained from the coefficients of x, y and z in the components of T(x, y, z).
T(e1) = T(1, 0, 0) = (a1, b1, c1) = a1 e1 + b1 e2 + c1 e3
T(e2) = T(0, 1, 0) = (a2, b2, c2) = a2 e1 + b2 e2 + c2 e3
T(e3) = T(0, 0, 1) = (a3, b3, c3) = a3 e1 + b3 e2 + c3 e3

Accordingly,

        ( a1 a2 a3 )
[T]_e = ( b1 b2 b3 )
        ( c1 c2 c3 )

Remark: This property holds for any space K^n but only relative to the usual basis

{e1 = (1, 0, ..., 0), e2 = (0, 1, 0, ..., 0), ..., en = (0, ..., 0, 1)}

7.4. Find the matrix representation of each of the following linear operators T on R^3 relative to the usual basis {e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)}:

(i) T(x, y, z) = (2x − 3y + 4z, 5x − y + 2z, 4x + 7y),   (ii) T(x, y, z) = (2y + z, x − 4y, 3x).

By Problem 7.3:

             ( 2 −3 4 )                  ( 0  2 1 )
(i) [T]_e =  ( 5 −1 2 ) ,   (ii) [T]_e = ( 1 −4 0 )
             ( 4  7 0 )                  ( 3  0 0 )

7.5. Let T be the linear operator on R^3 defined by T(x, y, z) = (2y + z, x − 4y, 3x).

(i) Find the matrix of T in the basis {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)}.
(ii) Verify that [T]_f [v]_f = [T(v)]_f for any vector v ∈ R^3.

We must first find the coordinates of an arbitrary vector (a, b, c) ∈ R^3 with respect to the basis {f1, f2, f3}. Write (a, b, c) as a linear combination of the f_i using unknown scalars x, y and z:

(a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)

Set corresponding components equal to each other to obtain the system of equations

x + y + z = a,   x + y = b,   x = c

Solve the system for x, y and z in terms of a, b and c to find x = c, y = b − c, z = a − b. Thus

(a, b, c) = c f1 + (b − c) f2 + (a − b) f3

(i) Since T(x, y, z) = (2y + z, x − 4y, 3x),

T(f1) = T(1, 1, 1) = (3, −3, 3) = 3f1 − 6f2 + 6f3                  (  3  3  3 )
T(f2) = T(1, 1, 0) = (2, −3, 3) = 3f1 − 6f2 + 5f3   and   [T]_f =  ( −6 −6 −2 )
T(f3) = T(1, 0, 0) = (0, 1, 3)  = 3f1 − 2f2 − f3                   (  6  5 −1 )

(ii) Suppose v = (a, b, c); then

v = c f1 + (b − c) f2 + (a − b) f3   and so   [v]_f = (c, b − c, a − b)^t

Also,

T(v) = T(a, b, c) = (2b + c, a − 4b, 3a) = 3a f1 + (−2a − 4b) f2 + (−a + 6b + c) f3

Thus [T(v)]_f = (3a, −2a − 4b, −a + 6b + c)^t, and so

              (  3  3  3 ) (   c   )   (     3a     )
[T]_f [v]_f = ( −6 −6 −2 ) ( b − c ) = (  −2a − 4b  ) = [T(v)]_f
              (  6  5 −1 ) ( a − b )   ( −a + 6b + c )

7.6. Let A = ( 1 2 ) and let T be the linear operator on R^2 defined by T(v) = Av (where
             ( 3 4 )
v is written as a column vector).
Find the matrix of T in each of the following bases: (i) {e1 = (1, 0), e2 = (0, 1)}, i.e. the usual basis; (ii) {f1 = (1, 3), f2 = (2, 5)}.

(i)  T(e1) = ( 1 2 ) ( 1 ) = ( 1 ) = e1 + 3e2
             ( 3 4 ) ( 0 )   ( 3 )               and   [T]_e = ( 1 2 )
     T(e2) = ( 1 2 ) ( 0 ) = ( 2 ) = 2e1 + 4e2                 ( 3 4 )
             ( 3 4 ) ( 1 )   ( 4 )

Observe that the matrix of T in the usual basis is precisely the original matrix A which defined T. This is not unusual. In fact, we show in the next problem that this is true for any matrix A when using the usual basis.

(ii) By Problem 7.2, (a, b) = (2b − 5a)f1 + (3a − b)f2. Hence

T(f1) = ( 1 2 ) ( 1 ) = (  7 ) = −5f1 + 6f2
        ( 3 4 ) ( 3 )   ( 15 )                   and thus   [T]_f = ( −5 −8 )
T(f2) = ( 1 2 ) ( 2 ) = ( 12 ) = −8f1 + 10f2                        (  6 10 )
        ( 3 4 ) ( 5 )   ( 26 )

7.7. Recall that any n-square matrix A = (a_ij) may be viewed as the linear operator T on K^n defined by T(v) = Av, where v is written as a column vector. Show that the matrix representation of T relative to the usual basis {e_i} of K^n is the matrix A, that is, [T]_e = A.

T(e1) = Ae1 = (a11, a21, ..., an1)^t = a11 e1 + a21 e2 + ... + an1 en
T(e2) = Ae2 = (a12, a22, ..., an2)^t = a12 e1 + a22 e2 + ... + an2 en
.....................................................................
T(en) = Aen = (a1n, a2n, ..., ann)^t = a1n e1 + a2n e2 + ... + ann en

(That is, T(e_i) = Ae_i is the ith column of A.) Accordingly,

        ( a11 a12 ... a1n )
[T]_e = ( a21 a22 ... a2n ) = A
        ( ................ )
        ( an1 an2 ... ann )

7.8. Each of the sets (i) {1, t, e^t, te^t} and (ii) {e^{3t}, te^{3t}, t^2 e^{3t}} is a basis of a vector space V of functions f : R → R. Let D be the differential operator on V, that is, D(f) = df/dt. Find the matrix of D in the given basis.

(i)  D(1)    = 0          = 0(1) + 0(t) + 0(e^t) + 0(te^t)
     D(t)    = 1          = 1(1) + 0(t) + 0(e^t) + 0(te^t)
     D(e^t)  = e^t        = 0(1) + 0(t) + 1(e^t) + 0(te^t)
     D(te^t) = e^t + te^t = 0(1) + 0(t) + 1(e^t) + 1(te^t)

and

       ( 0 1 0 0 )
[D]  = ( 0 0 0 0 )
       ( 0 0 1 1 )
       ( 0 0 0 1 )

(ii) D(e^{3t})     = 3e^{3t}                = 3(e^{3t}) + 0(te^{3t}) + 0(t^2 e^{3t})
     D(te^{3t})    = e^{3t} + 3te^{3t}      = 1(e^{3t}) + 3(te^{3t}) + 0(t^2 e^{3t})
     D(t^2 e^{3t}) = 2te^{3t} + 3t^2 e^{3t} = 0(e^{3t}) + 2(te^{3t}) + 3(t^2 e^{3t})

and

       ( 3 1 0 )
[D]  = ( 0 3 2 )
       ( 0 0 3 )

7.9. Prove Theorem 7.1: Suppose {e1, ..., en} is a basis of V and T is a linear operator on V. Then for any v ∈ V, [T]_e [v]_e = [T(v)]_e.

Suppose, for i = 1,
..., n,

T(e_i) = a_i1 e1 + a_i2 e2 + ... + a_in en = Σ_j a_ij e_j

Then [T]_e is the n-square matrix whose jth row is

(a_1j, a_2j, ..., a_nj)     (1)

Now suppose

v = k1 e1 + k2 e2 + ... + kn en = Σ_i k_i e_i

Writing a column vector as the transpose of a row vector,

[v]_e = (k1, k2, ..., kn)^t     (2)

Furthermore, using the linearity of T,

T(v) = T(Σ_i k_i e_i) = Σ_i k_i T(e_i) = Σ_i k_i (Σ_j a_ij e_j)
     = Σ_j (Σ_i a_ij k_i) e_j = Σ_j (a_1j k1 + a_2j k2 + ... + a_nj kn) e_j

Thus [T(v)]_e is the column vector whose jth entry is

a_1j k1 + a_2j k2 + ... + a_nj kn     (3)

On the other hand, the jth entry of [T]_e [v]_e is obtained by multiplying the jth row of [T]_e by [v]_e, i.e. (1) by (2). But the product of (1) and (2) is (3); hence [T]_e [v]_e and [T(v)]_e have the same entries. Thus [T]_e [v]_e = [T(v)]_e.

7.10. Prove Theorem 7.2: Let {e1, ..., en} be a basis of V over K, and let A be the algebra of n-square matrices over K. Then the mapping T ↦ [T]_e is a vector space isomorphism from A(V) onto A. That is, the mapping is one-one and onto and, for any S, T ∈ A(V) and any k ∈ K, [T + S]_e = [T]_e + [S]_e and [kT]_e = k[T]_e.

The mapping is one-one since, by Theorem 8.1, a linear mapping is completely determined by its values on a basis. The mapping is onto since each matrix M ∈ A is the image of the linear operator

F(e_i) = Σ_j m_ij e_j,   i = 1, ..., n

where (m_ij) is the transpose of the matrix M.

Now suppose, for i = 1, ..., n,

T(e_i) = Σ_j a_ij e_j   and   S(e_i) = Σ_j b_ij e_j

Let A and B be the matrices A = (a_ij) and B = (b_ij). Then [T]_e = A^t and [S]_e = B^t. We have, for i = 1, ..., n,

(T + S)(e_i) = T(e_i) + S(e_i) = Σ_j (a_ij + b_ij) e_j

Observe that A + B is the matrix (a_ij + b_ij). Accordingly,

[T + S]_e = (A + B)^t = A^t + B^t = [T]_e + [S]_e

We also have, for i = 1, ..., n,

(kT)(e_i) = k T(e_i) = k Σ_j a_ij e_j = Σ_j (k a_ij) e_j

Observe that kA is the matrix (k a_ij). Accordingly,

[kT]_e = (kA)^t = k A^t = k[T]_e

Thus the theorem is proved.

7.11. Prove Theorem 7.3: Let {e1, ..., en} be a basis of V.
Then for any linear operators S, T ∈ A(V), [ST]_e = [S]_e [T]_e.

Suppose T(e_i) = Σ_j a_ij e_j and S(e_j) = Σ_k b_jk e_k. Let A and B be the matrices A = (a_ij) and B = (b_jk). Then [T]_e = A^t and [S]_e = B^t. We have

(ST)(e_i) = S(T(e_i)) = S(Σ_j a_ij e_j) = Σ_j a_ij S(e_j)
          = Σ_j a_ij (Σ_k b_jk e_k) = Σ_k (Σ_j a_ij b_jk) e_k

Recall that AB is the matrix AB = (c_ik) where c_ik = Σ_j a_ij b_jk. Accordingly,

[ST]_e = (AB)^t = B^t A^t = [S]_e [T]_e

CHANGE OF BASIS, SIMILAR MATRICES

7.12. Consider these bases of R^2: {e1 = (1, 0), e2 = (0, 1)} and {f1 = (1, 3), f2 = (2, 5)}.

(i) Find the transition matrix P from {e_i} to {f_i}. (ii) Find the transition matrix Q from {f_i} to {e_i}. (iii) Verify that Q = P^−1. (iv) Show that [v]_f = P^−1 [v]_e for any vector v ∈ R^2. (v) Show that [T]_f = P^−1 [T]_e P for the operator T on R^2 defined by T(x, y) = (2y, 3x − y). (See Problems 7.1 and 7.2.)

(i)  f1 = (1, 3) = 1e1 + 3e2    and   P = ( 1 2 )
     f2 = (2, 5) = 2e1 + 5e2              ( 3 5 )

(ii) By Problem 7.2, (a, b) = (2b − 5a)f1 + (3a − b)f2. Thus

e1 = (1, 0) = −5f1 + 3f2    and   Q = ( −5  2 )
e2 = (0, 1) = 2f1 − f2                (  3 −1 )

(iii) PQ = ( 1 2 ) ( −5  2 ) = ( 1 0 ) = I
           ( 3 5 ) (  3 −1 )   ( 0 1 )

(iv) If v = (a, b), then [v]_e = ( a ) and [v]_f = ( 2b − 5a ). Hence
                                 ( b )             ( 3a − b  )

P^−1 [v]_e = ( −5  2 ) ( a ) = ( 2b − 5a ) = [v]_f
             (  3 −1 ) ( b )   ( 3a − b  )

(v) By Problems 7.1 and 7.2, [T]_e = ( 0  2 ) and [T]_f = ( −30 −48 ). Thus
                                     ( 3 −1 )             (  18  29 )

P^−1 [T]_e P = ( −5  2 ) ( 0  2 ) ( 1 2 ) = ( −30 −48 ) = [T]_f
               (  3 −1 ) ( 3 −1 ) ( 3 5 )   (  18  29 )

7.13. Consider the following bases of R^3: {e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)} and {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)}.

(i) Find the transition matrix P from {e_i} to {f_i}. (ii) Find the transition matrix Q from {f_i} to {e_i}. (iii) Verify that Q = P^−1. (iv) Show that [v]_f = P^−1 [v]_e for any vector v ∈ R^3. (v) Show that [T]_f = P^−1 [T]_e P for the T defined by T(x, y, z) = (2y + z, x − 4y, 3x). (See Problems 7.4 and 7.5.)

(i)  f1 = (1, 1, 1) = 1e1 + 1e2 + 1e3                ( 1 1 1 )
     f2 = (1, 1, 0) = 1e1 + 1e2 + 0e3    and   P  =  ( 1 1 0 )
     f3 = (1, 0, 0) = 1e1 + 0e2 + 0e3                ( 1 0 0 )

(ii) By Problem 7.5, (a, b, c) = c f1 + (b − c) f2 + (a − b) f3.
Thus

e1 = (1, 0, 0) = 0f1 + 0f2 + 1f3                ( 0  0  1 )
e2 = (0, 1, 0) = 0f1 + 1f2 − 1f3    and   Q  =  ( 0  1 −1 )
e3 = (0, 0, 1) = 1f1 − 1f2 + 0f3                ( 1 −1  0 )

(iii) PQ = ( 1 1 1 ) ( 0  0  1 )   ( 1 0 0 )
           ( 1 1 0 ) ( 0  1 −1 ) = ( 0 1 0 ) = I
           ( 1 0 0 ) ( 1 −1  0 )   ( 0 0 1 )

(iv) If v = (a, b, c), then [v]_e = (a, b, c)^t and [v]_f = (c, b − c, a − b)^t. Thus

P^−1 [v]_e = ( 0  0  1 ) ( a )   (   c   )
             ( 0  1 −1 ) ( b ) = ( b − c ) = [v]_f
             ( 1 −1  0 ) ( c )   ( a − b )

(v) By Problems 7.4(ii) and 7.5,

        ( 0  2 1 )                 (  3  3  3 )
[T]_e = ( 1 −4 0 )   and   [T]_f = ( −6 −6 −2 )
        ( 3  0 0 )                 (  6  5 −1 )

Thus

               ( 0  0  1 ) ( 0  2 1 ) ( 1 1 1 )   (  3  3  3 )
P^−1 [T]_e P = ( 0  1 −1 ) ( 1 −4 0 ) ( 1 1 0 ) = ( −6 −6 −2 ) = [T]_f
               ( 1 −1  0 ) ( 3  0 0 ) ( 1 0 0 )   (  6  5 −1 )

7.14. Prove Theorem 7.4: Let P be the transition matrix from a basis {e_i} to a basis {f_i} in a vector space V. Then for any v ∈ V, P[v]_f = [v]_e. Also, [v]_f = P^−1 [v]_e.

Suppose, for i = 1, ..., n,

f_i = a_i1 e1 + a_i2 e2 + ... + a_in en = Σ_j a_ij e_j

Then P is the n-square matrix whose jth row is

(a_1j, a_2j, ..., a_nj)     (1)

Also suppose

v = k1 f1 + k2 f2 + ... + kn fn = Σ_i k_i f_i

Then, writing a column vector as the transpose of a row vector,

[v]_f = (k1, k2, ..., kn)^t     (2)

Substituting for f_i in the equation for v,

v = Σ_i k_i f_i = Σ_i k_i (Σ_j a_ij e_j) = Σ_j (Σ_i a_ij k_i) e_j
  = Σ_j (a_1j k1 + a_2j k2 + ... + a_nj kn) e_j

Accordingly, [v]_e is the column vector whose jth entry is

a_1j k1 + a_2j k2 + ... + a_nj kn     (3)

On the other hand, the jth entry of P[v]_f is obtained by multiplying the jth row of P by [v]_f, i.e. (1) by (2). But the product of (1) and (2) is (3); hence P[v]_f and [v]_e have the same entries and thus P[v]_f = [v]_e.

Furthermore, multiplying the above by P^−1 gives P^−1 [v]_e = P^−1 P [v]_f = [v]_f.

7.15. Prove Theorem 7.5: Let P be the transition matrix from a basis {e_i} to a basis {f_i} in a vector space V. Then, for any linear operator T on V, [T]_f = P^−1 [T]_e P.

For any vector v ∈ V,

P^−1 [T]_e P [v]_f = P^−1 [T]_e [v]_e = P^−1 [T(v)]_e = [T(v)]_f
Show that similarity of matrices is an equivalence relation, that is: (i) A is similar to A; (ii) if A is similar to B, then B is similar to A; (iii) if A is similar to B and B is similar to C then A is similar to C. (i) The identity matrix / is invertible and / = /"i. Since A = I-^AI, A is similar to A. (11) Since A is similar to B there exists an invertible matrix P such that A = P-^BP. Hence B = PAP-i = (P-i)-»AP-i and P^^ is invertible. Thus B is similar to A. (iii) Since A is similar to B there exists an invertible matrix P such that A = p-iPP, and since B is similar to C there exists an invertible matrix Q such that B = Q-^CQ. Hence A = p-iBP = P-^(Q-^CQ)P = (QP)->C(QP) and QP is invertible. Thus A is similar to C. TRACE 7.17. The trace of a square matrix A = (oij), written tr (A), is the sum of its diagonal elements: tr (A) = an + ^ + • • • + a„„. Show that (i) tr (AB) = tr (BA), (ii) if A is similar to B then tr (A) = tr (B). n (1) Suppose A = (a„) and B = (fty). Then AP = (ci^) where Cj^ = ^ aij&jfc- Thus n n n tr(AP) = 2 Cii = 2 2 ttyfeji i=l i=l }=1 n On the other hand, BA = (d^^) where dj^ = 2 6ji«Hic- Thus i=l tr(PA) = 2 dji = 2 2 6ii«« = 2 2aa6;i = tr(AP) j = l 3=1 i=l >=1 5 = 1 (ii) If A is similar to B, there exists an invertible matrix P such that A = P-^BP. Using (i), tr(A) = tr(P-iPP) = tr (PPP-i) = tr (P) 7.18. Find the trace of the following operator on R^: T{x, y, z) = (aiflj + a2y + a^z, bix + h^y + hsz, Cix + Czy + csz) We first must find a matrix representation of T. Choosing the usual basis {ej, h 62 &3 Cl C2 C3 / and tr (T) = tr ([T],) = di + 63 + C3- 7.19. Let V be the space of 2 x 2 matrices over R, and let Af = f g ^ j . Let T be the linear operator on V defined by T{A) = MA. Find the trace of T. We must first find a matrix representation of T. Choose the usual basis of V: CHAP. 
Then

T(E1) = ME1 = ( 1 2 ) ( 1 0 ) = ( 1 0 ) = 1E1 + 0E2 + 3E3 + 0E4
              ( 3 4 ) ( 0 0 )   ( 3 0 )
T(E2) = ME2 = ( 1 2 ) ( 0 1 ) = ( 0 1 ) = 0E1 + 1E2 + 0E3 + 3E4
              ( 3 4 ) ( 0 0 )   ( 0 3 )
T(E3) = ME3 = ( 1 2 ) ( 0 0 ) = ( 2 0 ) = 2E1 + 0E2 + 4E3 + 0E4
              ( 3 4 ) ( 1 0 )   ( 4 0 )
T(E4) = ME4 = ( 1 2 ) ( 0 0 ) = ( 0 2 ) = 0E1 + 2E2 + 0E3 + 4E4
              ( 3 4 ) ( 0 1 )   ( 0 4 )

Hence

        ( 1 0 2 0 )
[T]_e = ( 0 1 0 2 )   and   tr(T) = 1 + 1 + 4 + 4 = 10
        ( 3 0 4 0 )
        ( 0 3 0 4 )

MATRIX REPRESENTATIONS OF LINEAR MAPPINGS

7.20. Let F : R^3 → R^2 be the linear mapping defined by F(x, y, z) = (3x + 2y − 4z, x − 5y + 3z).

(i) Find the matrix of F in the following bases of R^3 and R^2:

{f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)},   {g1 = (1, 3), g2 = (2, 5)}

(ii) Verify that the action of F is preserved by its matrix representation; that is, for any v ∈ R^3, [F]_f^g [v]_f = [F(v)]_g.

(i) By Problem 7.2, (a, b) = (2b − 5a)g1 + (3a − b)g2. Hence

F(f1) = F(1, 1, 1) = (1, −1) = −7g1 + 4g2
F(f2) = F(1, 1, 0) = (5, −4) = −33g1 + 19g2     and   [F]_f^g = ( −7 −33 −13 )
F(f3) = F(1, 0, 0) = (3, 1)  = −13g1 + 8g2                      (  4  19   8 )

(ii) If v = (x, y, z) then, by Problem 7.5, v = z f1 + (y − z) f2 + (x − y) f3. Also,

F(v) = (3x + 2y − 4z, x − 5y + 3z) = (−13x − 20y + 26z)g1 + (8x + 11y − 15z)g2

and

[v]_f = (z, y − z, x − y)^t   and   [F(v)]_g = ( −13x − 20y + 26z )
                                               (  8x + 11y − 15z  )

Thus

[F]_f^g [v]_f = ( −7 −33 −13 ) (   z   ) = ( −13x − 20y + 26z ) = [F(v)]_g
                (  4  19   8 ) ( y − z )   (  8x + 11y − 15z  )
                               ( x − y )

7.21. Let F : K^n → K^m be the linear mapping defined by

F(x1, x2, ..., xn) = (a11 x1 + ... + a1n xn, a21 x1 + ... + a2n xn, ..., am1 x1 + ... + amn xn)

Show that the matrix representation of F relative to the usual bases of K^n and of K^m is given by

      ( a11 a12 ... a1n )
[F] = ( a21 a22 ... a2n )
      ( ................. )
      ( am1 am2 ... amn )

That is, the rows of [F] are obtained from the coefficients of the x_i in the components of F(x1, ..., xn), respectively.

F(1, 0, ..., 0) = (a11, a21, ..., am1)                   ( a11 a12 ... a1n )
F(0, 1, ..., 0) = (a12, a22, ..., am2)     and   [F]  =  ( a21 a22 ... a2n )
......................................                   ( ................. )
F(0, 0, ..., 1) = (a1n, a2n, ..., amn)                   ( am1 am2 ... amn )
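The verification in Problem 7.20(ii) can also be carried out numerically on a concrete vector. The sketch below is ours, not the book's (the helpers `matvec`, `coords_f` and `coords_g` are assumed names encoding the coordinate formulas of Problems 7.2 and 7.5).

```python
# Re-check Problem 7.20: F(x, y, z) = (3x + 2y - 4z, x - 5y + 3z), with
# bases {f1=(1,1,1), f2=(1,1,0), f3=(1,0,0)} of R^3 and {g1=(1,3), g2=(2,5)} of R^2.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def F(x, y, z):
    return (3*x + 2*y - 4*z, x - 5*y + 3*z)

def coords_f(x, y, z):
    # (x, y, z) = z f1 + (y - z) f2 + (x - y) f3   (Problem 7.5)
    return [z, y - z, x - y]

def coords_g(a, b):
    # (a, b) = (2b - 5a) g1 + (3a - b) g2          (Problem 7.2)
    return [2*b - 5*a, 3*a - b]

Fg = [[-7, -33, -13], [4, 19, 8]]    # the matrix found in Problem 7.20(i)

v = (2, -1, 4)                       # an arbitrary test vector
assert matvec(Fg, coords_f(*v)) == coords_g(*F(*v))   # Theorem 7.8 in action
```

Any other test vector works equally well, since Theorem 7.8 holds identically in x, y, z.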
7.22. Find the matrix representation of each of the following linear mappings relative to the usual bases of R^n:

(i) F : R^2 → R^3 defined by F(x, y) = (3x − y, 2x + 4y, 5x − 6y)
(ii) F : R^4 → R^2 defined by F(x, y, s, t) = (3x − 4y + 2s − 5t, 5x + 7y − s − 2t)
(iii) F : R^3 → R^4 defined by F(x, y, z) = (2x + 3y − 8z, x + y + z, 4x − 5z, 6y)

By Problem 7.21, we need only look at the coefficients of the unknowns in F(x, y, ...). Thus

          ( 3 −1 )                                              ( 2 3 −8 )
(i) [F] = ( 2  4 ) ,  (ii) [F] = ( 3 −4  2 −5 ) ,  (iii) [F] =  ( 1 1  1 )
          ( 5 −6 )               ( 5  7 −1 −2 )                 ( 4 0 −5 )
                                                                ( 0 6  0 )

7.23. Let T : R^2 → R^2 be defined by T(x, y) = (2x − 3y, x + 4y). Find the matrix of T in the bases {e1 = (1, 0), e2 = (0, 1)} and {f1 = (1, 3), f2 = (2, 5)} of R^2 respectively. (We can view T as a linear mapping from one space into another, each having its own basis.)

By Problem 7.2, (a, b) = (2b − 5a)f1 + (3a − b)f2. Then

T(e1) = T(1, 0) = (2, 1)  = −8f1 + 5f2     and   [T]_e^f = ( −8  23 )
T(e2) = T(0, 1) = (−3, 4) = 23f1 − 13f2                    (  5 −13 )

7.24. Let A = ( 2  5 −3 ). Recall that A determines a linear mapping F : R^3 → R^2
              ( 1 −4  7 )
defined by F(v) = Av, where v is written as a column vector.

(i) Show that the matrix representation of F relative to the usual bases of R^3 and of R^2 is the matrix A itself: [F] = A.
(ii) Find the matrix representation of F relative to the following bases of R^3 and R^2:

{f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)},   {g1 = (1, 3), g2 = (2, 5)}

(i) F(1, 0, 0) = (2, 1)  = 2e1 + e2
    F(0, 1, 0) = (5, −4) = 5e1 − 4e2       from which   [F] = ( 2  5 −3 ) = A
    F(0, 0, 1) = (−3, 7) = −3e1 + 7e2                         ( 1 −4  7 )

(Compare with Problem 7.7.)

(ii) By Problem 7.2, (a, b) = (2b − 5a)g1 + (3a − b)g2. Then

F(f1) = A(1, 1, 1)^t = (4, 4)   = −12g1 + 8g2
F(f2) = A(1, 1, 0)^t = (7, −3)  = −41g1 + 24g2    and   [F]_f^g = ( −12 −41 −8 )
F(f3) = A(1, 0, 0)^t = (2, 1)   = −8g1 + 5g2                      (   8  24  5 )

7.25. Prove Theorem 7.12: Let F : V → U be linear and, say, rank F = r. Then there exist bases of V and of U such that the matrix representation A of F has the form

A = ( I  0 )
    ( 0  0 )

where I is the r-square identity matrix and r is the rank of F.

Suppose dim V = m and dim U = n. Let W be the kernel of F and U′ the image of F.
We are given that rank F = r; hence the dimension of the kernel of F is m − r. Let {w1, ..., w_{m−r}} be a basis of the kernel of F and extend this to a basis of V:

{v1, ..., vr, w1, ..., w_{m−r}}

Set

u1 = F(v1), u2 = F(v2), ..., ur = F(vr)

We note that {u1, ..., ur} is a basis of U′, the image of F. Extend this to a basis {u1, ..., ur, u_{r+1}, ..., un} of U. Observe that

F(v1) = u1 = 1u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
F(v2) = u2 = 0u1 + 1u2 + ... + 0ur + 0u_{r+1} + ... + 0un
.........................................................
F(vr) = ur = 0u1 + 0u2 + ... + 1ur + 0u_{r+1} + ... + 0un
F(w1) = 0  = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
.........................................................
F(w_{m−r}) = 0 = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un

Thus the matrix of F in the above bases has the required form.

Supplementary Problems

MATRIX REPRESENTATIONS OF LINEAR OPERATORS

7.26. Find the matrix of each of the following linear operators T on R^2 with respect to the usual basis {e1 = (1, 0), e2 = (0, 1)}: (i) T(x, y) = (2x − 3y, x + y), (ii) T(x, y) = (5x + y, 3x − 2y).

7.27. Find the matrix of each operator T in the preceding problem with respect to the basis {f1 = (1, 2), f2 = (2, 3)}. In each case, verify that [T]_f [v]_f = [T(v)]_f for any v ∈ R^2.

7.28. Find the matrix of each operator T in Problem 7.26 in the basis {g1 = (1, 3), g2 = (1, 4)}.

7.29. Find the matrix representation of each of the following linear operators T on R^3 relative to the usual basis:
Let V be the vector space of 2 X 2 matrices over R and let Af = ( ) . Find the matrix of each of the following linear operators T on V in the usual basis (see Problem 7.19) of V: (i) T{A) = MA, (ii) T{A) = AM, (iii) T(A) =^MA- AM. 7.33. Let ly and Oy denote the identity and zero operators, respectively, on a vector space V. Show that, for any basis {ej of V, (i) [1^]^ = I, the identity matrix, (ii) [Oy]^ = 0, the zero matrix. CHANGE OF BASIS, SIMILAR MATRICES 7.34. Consider the following bases of R^: {e^ - (1, 0), eg = (0, 1)} and {/i = (1, 2), ^ = (2, 3)}. (i) Find the transition matrices P and Q from {gj} to {/J and from {/j} to {ej, respectively. Verify Q = P-i. (ii) Show that [v]^ = P[v]f for any vector v G fP. (iii) Show that [T]f - P-'^[T]^P for each operator T in Problem 7.26. 7.35. Repeat Problem 7.34 for the bases {/i = (1,2), /a = (2,3)} and {g^ = (1,3), g^ = (1.4)}. 7.36. Suppose {e^, e^} is a basis of V and T :V -^V is the linear operator for which T^e^) = Se^ — 2e2 and T{e2) = ej + 4e2. Suppose {/i, /a} is the basis of V for which /i = ei + e^ and /z = 2ei + 3e2. Find the matrix of T in the basis {/i, /j}. 7.37. Consider the bases B — {1, i} and B' = {1 + i, 1 + 2i} of the complex field C over the real field R. (i) Find the transition matrices P and Q from B to B' and from B' to B, respectively. Verify that Q = P-\ (ii) Show that [T]^, = P-'^[T]bP for the conjugation operator T in Problem 7.31. 7.38. Suppose {ej, {/J and {flrj} are bases of V, and that P and Q are the transition matrices from {ej to {/j} and from {/J to {ffj, respectively. Show that PQ is the transition matrix from {ej to {fTj}. 7.39. Let A be a 2 by 2 matrix such that only A is similar to itself. Show that A has the form A Generalize to w X w matrices. = c :) 7.40. Show that all the matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank. MATRIX REPRESENTATIONS OF LINEAR MAPPINGS 7.41. 
Find the matrix representation of each of the following linear mappings relative to the usual bases of R^n:
(i) F : R^3 → R^2 defined by F(x, y, z) = (2x − 4y + 9z, 5x + 3y − 2z)
(ii) F : R^2 → R^4 defined by F(x, y) = (3x + 4y, 5x − 2y, x + 7y, 4x)
(iii) F : R^4 → R defined by F(x, y, s, t) = 2x + 3y − 7s − t
(iv) F : R → R^2 defined by F(x) = (3x, 5x)

7.42. Let F : R^3 → R^2 be the linear mapping defined by F(x, y, z) = (2x + y − z, 3x − 2y + 4z). (i) Find the matrix of F in the following bases of R^3 and R^2:

    {f_1 = (1, 1, 1), f_2 = (1, 1, 0), f_3 = (1, 0, 0)}   and   {g_1 = (1, 3), g_2 = (1, 4)}

(ii) Verify that, for any vector v ∈ R^3, [F]_f^g [v]_f = [F(v)]_g.

7.43. Let {e_i} and {f_i} be bases of V, and let 1_V be the identity mapping on V. Show that the matrix of 1_V in the bases {e_i} and {f_i} is the inverse of the transition matrix P from {e_i} to {f_i}.

7.44. Prove Theorem 7.7, page 155. (Hint. See Problem 7.9, page 161.)

7.45. Prove Theorem 7.8. (Hint. See Problem 7.10.)

7.46. Prove Theorem 7.9. (Hint. See Problem 7.11.)

7.47. Prove Theorem 7.10. (Hint. See Problem 7.15.)

MISCELLANEOUS PROBLEMS

7.48. Let T be a linear operator on V and let W be a subspace of V invariant under T, that is, T(W) ⊆ W. Suppose dim W = m. Show that T has a matrix representation of the form [A B; 0 C] where A is an m × m submatrix.

7.49. Let V = U ⊕ W, and let U and W each be invariant under a linear operator T : V → V. Suppose dim U = m and dim W = n. Show that T has a matrix representation of the form [A 0; 0 B] where A and B are m × m and n × n submatrices, respectively.

7.50. Recall that two linear operators F and G on V are said to be similar if there exists an invertible operator T on V such that G = T^{-1} F T. (i) Show that linear operators F and G are similar if and only if, for any basis {e_i} of V, the matrix representations [F]_e and [G]_e are similar matrices. (ii) Show that if an operator F is diagonalizable, then any similar operator G is also diagonalizable.

7.51.
Two m × n matrices A and B over K are said to be equivalent if there exists an m-square invertible matrix Q and an n-square invertible matrix P such that B = QAP. (i) Show that equivalence of matrices is an equivalence relation. (ii) Show that A and B can be matrix representations of the same linear mapping F : V → U if and only if A and B are equivalent. (iii) Show that every matrix A is equivalent to a matrix of the form [I 0; 0 0] where I is the r-square identity matrix and r = rank A.

7.52. Two algebras A and B over a field K are said to be isomorphic (as algebras) if there exists a bijective mapping f : A → B such that for u, v ∈ A and k ∈ K, (i) f(u + v) = f(u) + f(v), (ii) f(ku) = k f(u), (iii) f(uv) = f(u) f(v). (That is, f preserves the three operations of an algebra: vector addition, scalar multiplication, and vector multiplication.) The mapping f is then called an isomorphism of A onto B. Show that the relation of algebra isomorphism is an equivalence relation.

7.53. Let 𝒜 be the algebra of n-square matrices over K, and let P be an invertible matrix in 𝒜. Show that the map A ↦ P^{-1}AP, where A ∈ 𝒜, is an algebra isomorphism of 𝒜 onto itself.

Answers to Supplementary Problems

7.26. (i) [2 −3; 1 1]   (ii) [5 1; 3 −2]

7.27. Here (a, b) = (2b − 3a)f_1 + (2a − b)f_2.   (i) [18 25; −11 −15]   (ii) [−23 −39; 15 26]

7.28. Here (a, b) = (4a − b)g_1 + (b − 3a)g_2.   (i) [−32 −45; 25 35]   (ii) [35 41; −27 −32]

7.29. (i) [1 0 0; 0 1 0; 0 0 0]   (ii) [2 −7 −4; 3 1 4; 6 −8 1]   (iii) [0 0 1; 0 1 1; 1 1 1]

7.30. (i) [1 0 0; 0 2 1; 0 0 2]   (ii) [0 −1; 1 0]   (iii) [5 1 0; 0 5 2; 0 0 5]   (iv) [0 1 0 0; 0 0 0 0; 0 0 0 −3; 0 0 3 0]

7.31. (i) [1 0; 0 −1]   (ii) [3 4; −2 −3]

7.32. (i)–(iii) [illegible in this copy]

7.34. P = [1 2; 2 3],   Q = [−3 2; 2 −1]

7.35. P = [3 5; −1 −2],   Q = [2 5; −1 −3]

7.36. [8 11; −2 −1]

7.37. P = [1 1; 1 2],   Q = [2 −1; −1 1]

7.41. (i) [2 −4 9; 5 3 −2]   (ii) [3 4; 5 −2; 1 7; 4 0]   (iii) (2, 3, −7, −1)   (iv) [3; 5]

7.42.
(i) [3 11 5; −1 −8 −3]

chapter 8

Determinants

INTRODUCTION

To every square matrix A over a field K there is assigned a specific scalar called the determinant of A; it is usually denoted by

    det (A)   or   |A|

This determinant function was first discovered in the investigation of systems of linear equations. We shall see in the succeeding chapters that the determinant is an indispensable tool in investigating and obtaining properties of a linear operator. We comment that the definition of the determinant and most of its properties also apply in the case where the entries of a matrix come from a ring (see Appendix B).

We shall begin the chapter with a discussion of permutations, which is necessary for the definition of the determinant.

PERMUTATIONS

A one-to-one mapping σ of the set {1, 2, ..., n} onto itself is called a permutation. We denote the permutation σ by

    σ = (1 2 ... n; j_1 j_2 ... j_n)   or   σ = j_1 j_2 ... j_n,   where j_i = σ(i)

Observe that since σ is one-to-one and onto, the sequence j_1 j_2 ... j_n is simply a rearrangement of the numbers 1, 2, ..., n. We remark that the number of such permutations is n!, and that the set of them is usually denoted by S_n. We also remark that if σ ∈ S_n, then the inverse mapping σ^{-1} ∈ S_n; and if σ, τ ∈ S_n, then the composition mapping σ ∘ τ ∈ S_n. In particular, the identity mapping ε belongs to S_n. (In fact, ε = 12...n.)

Example 8.1: There are 2! = 2·1 = 2 permutations in S_2: 12 and 21.

Example 8.2: There are 3! = 3·2·1 = 6 permutations in S_3: 123, 132, 213, 231, 312, 321.

Consider an arbitrary permutation σ in S_n: σ = j_1 j_2 ... j_n. We say σ is even or odd according as to whether there is an even or odd number of pairs (i, k) for which

    i > k  but  i precedes k in σ    (*)

We then define the sign or parity of σ, written sgn σ, by

    sgn σ = 1 if σ is even,   sgn σ = −1 if σ is odd

172 DETERMINANTS [CHAP. 8

Example 8.3: Consider the permutation σ = 35142 in S_5.
3 and 5 precede and are greater than 1; hence (3, 1) and (5, 1) satisfy (*). 3, 5 and 4 precede and are greater than 2; hence (3, 2), (5, 2) and (4, 2) satisfy (*). 5 precedes and is greater than 4; hence (5, 4) satisfies (*). Since exactly six pairs satisfy (*), σ is even and sgn σ = 1.

Example 8.4: The identity permutation ε = 12...n is even since no pair can satisfy (*).

Example 8.5: In S_2, 12 is even, and 21 is odd. In S_3, 123, 231 and 312 are even, and 132, 213 and 321 are odd.

Example 8.6: Let τ be the permutation which interchanges two numbers i and j and leaves the other numbers fixed:

    τ(i) = j,   τ(j) = i,   τ(k) = k for k ≠ i, j

We call τ a transposition. If i < j, then

    τ = 12 ... (i−1) j (i+1) ... (j−1) i (j+1) ... n

There are 2(j − i − 1) + 1 pairs satisfying (*):

    (j, i),  and  (j, x), (x, i)  where  x = i+1, ..., j−1

Thus the transposition τ is odd.

DETERMINANT

Let A = (a_ij) be an n-square matrix over a field K:

    A = [a_11 a_12 ... a_1n; a_21 a_22 ... a_2n; ......; a_n1 a_n2 ... a_nn]

Consider a product of n elements of A such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form

    a_{1 j_1} a_{2 j_2} ... a_{n j_n}

that is, where the factors come from successive rows and so the first subscripts are in the natural order 1, 2, ..., n. Now since the factors come from different columns, the sequence of second subscripts forms a permutation σ = j_1 j_2 ... j_n in S_n. Conversely, each permutation in S_n determines a product of the above form. Thus the matrix A contains n! such products.

Definition: The determinant of the n-square matrix A = (a_ij), denoted by det (A) or |A|, is the following sum, which is summed over all permutations σ = j_1 j_2 ... j_n in S_n:

    |A| = Σ_σ (sgn σ) a_{1 j_1} a_{2 j_2} ... a_{n j_n}

That is,

    |A| = Σ_{σ ∈ S_n} (sgn σ) a_{1 σ(1)} a_{2 σ(2)} ... a_{n σ(n)}

The determinant of the n-square matrix A is said to be of order n and is frequently denoted by

    |a_11 a_12 ... a_1n; a_21 a_22 ... a_2n; ......; a_n1 a_n2 ... a_nn|
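The definition can be checked by brute force for small n. Here is a minimal sketch (in Python, not part of the text) that counts the pairs (*) to obtain sgn σ and then sums over all of S_n:

```python
from itertools import permutations
from math import prod

def sgn(p):
    """Parity of a permutation p of (0, ..., n-1): +1 if the number of
    pairs satisfying (*) (a larger number preceding a smaller) is even,
    -1 if it is odd."""
    n = len(p)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
    return 1 if inversions % 2 == 0 else -1

def det(A):
    """|A| = sum over sigma in S_n of (sgn sigma) a_{1,sigma(1)} ... a_{n,sigma(n)},
    computed directly from the definition (practical only for small n)."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))
```

For instance, σ = 35142 of Example 8.3 becomes the zero-indexed tuple (2, 4, 0, 3, 1), which `sgn` reports as even; since the sum ranges over all n! permutations, this evaluation is hopeless beyond small n, which is exactly why the chapter develops indirect methods.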
We emphasize that a square array of scalars enclosed by straight lines is not a matrix but rather the scalar that the determinant assigns to the matrix formed by the array of scalars.

Example 8.7: The determinant of a 1 × 1 matrix A = (a_11) is the scalar a_11 itself: |A| = a_11. (We note that the one permutation in S_1 is even.)

Example 8.8: In S_2, the permutation 12 is even and the permutation 21 is odd. Hence

    |a_11 a_12; a_21 a_22| = a_11 a_22 − a_12 a_21

Thus

    |4 −1; −5 −2| = 4(−2) − (−1)(−5) = −13   and   |a b; c d| = ad − bc

Example 8.9: In S_3, the permutations 123, 231 and 312 are even, and the permutations 321, 213 and 132 are odd. Hence

    |a_11 a_12 a_13; a_21 a_22 a_23; a_31 a_32 a_33|
      = a_11 a_22 a_33 + a_12 a_23 a_31 + a_13 a_21 a_32 − a_13 a_22 a_31 − a_12 a_21 a_33 − a_11 a_23 a_32

This may be written as

    a_11 (a_22 a_33 − a_23 a_32) − a_12 (a_21 a_33 − a_23 a_31) + a_13 (a_21 a_32 − a_22 a_31)
      = a_11 |a_22 a_23; a_32 a_33| − a_12 |a_21 a_23; a_31 a_33| + a_13 |a_21 a_22; a_31 a_32|

which is a linear combination of three determinants of order two whose coefficients (with alternating signs) form the first row of the given matrix. Note that each 2 × 2 determinant can be obtained by deleting, in the original matrix, the row and column containing its coefficient.

Example 8.10:

(i) |2 3 4; 5 6 7; 8 9 1| = 2 |6 7; 9 1| − 3 |5 7; 8 1| + 4 |5 6; 8 9| = 2(6 − 63) − 3(5 − 56) + 4(45 − 48) = 27

(ii) |2 3 −4; 0 −4 2; 1 −1 5| = 2 |−4 2; −1 5| − 3 |0 2; 1 5| + (−4) |0 −4; 1 −1| = 2(−20 + 2) − 3(0 − 2) − 4(0 + 4) = −46

As n increases, the number of terms in the determinant becomes astronomical. Accordingly, we use indirect methods to evaluate determinants rather than the definition. In fact we prove a number of properties about determinants which will permit us to shorten the computation considerably. In particular, we show that a determinant of order n is equal to a linear combination of determinants of order n − 1, as in the case n = 3 above.

PROPERTIES OF DETERMINANTS

We now list basic properties of the determinant.
Theorem 8.1: The determinant of a matrix A and its transpose A^t are equal: |A| = |A^t|.

By this theorem, any theorem about the determinant of a matrix A which concerns the rows of A will have an analogous theorem concerning the columns of A.

The next theorem gives certain cases for which the determinant can be obtained immediately.

Theorem 8.2: Let A be a square matrix.
(i) If A has a row (column) of zeros, then |A| = 0.
(ii) If A has two identical rows (columns), then |A| = 0.
(iii) If A is triangular, i.e. A has zeros above or below the diagonal, then |A| = product of diagonal elements. Thus in particular, |I| = 1 where I is the identity matrix.

The next theorem shows how the determinant of a matrix is affected by the "elementary" operations.

Theorem 8.3: Let B be the matrix obtained from a matrix A by
(i) multiplying a row (column) of A by a scalar k; then |B| = k|A|.
(ii) interchanging two rows (columns) of A; then |B| = −|A|.
(iii) adding a multiple of a row (column) of A to another; then |B| = |A|.

We now state two of the most important and useful theorems on determinants.

Theorem 8.4: Let A be any n-square matrix. Then the following are equivalent:
(i) A is invertible, i.e. A has an inverse A^{-1}.
(ii) A is nonsingular, i.e. AX = 0 has only the zero solution, or rank A = n, or the rows (columns) of A are linearly independent.
(iii) The determinant of A is not zero: |A| ≠ 0.

Theorem 8.5: The determinant is a multiplicative function. That is, the determinant of a product of two matrices A and B is equal to the product of their determinants: |AB| = |A| |B|.

We shall prove the above two theorems using the theory of elementary matrices (see page 56) and the following lemma.

Lemma 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E| |A|.

We comment that one can also prove the preceding two theorems directly without resorting to the theory of elementary matrices.
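The row operations of Theorem 8.3 are what make determinants computable in practice: reduce A to triangular form, tracking sign changes from swaps, and read off the product of the diagonal (Theorem 8.2(iii)). A small sketch (Python, not part of the text; exact arithmetic via Fraction):

```python
from fractions import Fraction

def det_by_elimination(A):
    """Evaluate |A| using Theorem 8.3: a row swap flips the sign (ii),
    adding a multiple of one row to another changes nothing (iii);
    the triangular result is the signed product of the diagonal."""
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)                # zero column -> |A| = 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]   # Theorem 8.3(ii)
            sign = -sign
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]   # Theorem 8.3(iii)
    p = Fraction(1)
    for i in range(n):
        p *= M[i][i]                          # Theorem 8.2(iii)
    return sign * p
```

This runs in O(n^3) steps instead of the n! terms of the definition, and it also gives a quick numerical check of the multiplicative property |AB| = |A||B| of Theorem 8.5.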
MINORS AND COFACTORS

Consider an n-square matrix A = (a_ij). Let M_ij denote the (n−1)-square submatrix of A obtained by deleting its ith row and jth column. The determinant |M_ij| is called the minor of the element a_ij of A, and we define the cofactor of a_ij, denoted by A_ij, to be the "signed" minor:

    A_ij = (−1)^{i+j} |M_ij|

Note that the "signs" (−1)^{i+j} accompanying the minors form a chessboard pattern with +'s on the main diagonal:

    + − + − ...
    − + − + ...
    + − + − ...
    ...........

We emphasize that M_ij denotes a matrix whereas A_ij denotes a scalar.

Example 8.11: Let A = [2 3 4; 5 6 7; 8 9 1]. Then

    M_23 = [2 3; 8 9]   and   A_23 = (−1)^{2+3} |2 3; 8 9| = −(18 − 24) = 6

The following theorem applies.

Theorem 8.7: The determinant of the matrix A = (a_ij) is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors:

    |A| = a_i1 A_i1 + a_i2 A_i2 + ... + a_in A_in = Σ_{j=1}^{n} a_ij A_ij

and

    |A| = a_1j A_1j + a_2j A_2j + ... + a_nj A_nj = Σ_{i=1}^{n} a_ij A_ij

The above formulas, called the Laplace expansions of the determinant of A by the ith row and the jth column respectively, offer a method of simplifying the computation of |A|. That is, by adding a multiple of a row (column) to another row (column) we can reduce A to a matrix containing a row or column with one entry 1 and the others 0. Expanding by this row or column reduces the computation of |A| to the computation of a determinant of order one less than that of |A|.

Example 8.12: Compute the determinant of A = [5 4 2 1; 2 3 1 −2; −5 −7 −3 9; 1 −2 −1 4]. Note that a 1 appears in the second row, third column. Perform the following operations on A, where R_i denotes the ith row: (i) add −2R_2 to R_1, (ii) add 3R_2 to R_3, (iii) add 1R_2 to R_4. By Theorem 8.3(iii), the value of the determinant does not change by these operations; that is,

    |A| = |5 4 2 1; 2 3 1 −2; −5 −7 −3 9; 1 −2 −1 4| = |1 −2 0 5; 2 3 1 −2; 1 2 0 3; 3 1 0 2|

Now if we expand by the third column, we may neglect all terms which contain 0.
Thus

    |A| = (−1)^{2+3} · 1 · |1 −2 5; 1 2 3; 3 1 2|
        = −{ 1 |2 3; 1 2| − (−2) |1 3; 3 2| + 5 |1 2; 3 1| }
        = −{ (4 − 3) + 2(2 − 9) + 5(1 − 6) } = 38

CLASSICAL ADJOINT

Consider an n-square matrix A = (a_ij) over a field K. The transpose of the matrix of cofactors of the elements a_ij of A, denoted by adj A, is called the classical adjoint of A:

    adj A = [A_11 A_21 ... A_n1; A_12 A_22 ... A_n2; ......; A_1n A_2n ... A_nn]

We say "classical adjoint" instead of simply "adjoint" because the term adjoint will be used in Chapter 13 for an entirely different concept.

Example 8.13: Let A = [2 3 −4; 0 −4 2; 1 −1 5]. The cofactors of the nine elements of A are

    A_11 = +|−4 2; −1 5| = −18,   A_12 = −|0 2; 1 5| = 2,    A_13 = +|0 −4; 1 −1| = 4
    A_21 = −|3 −4; −1 5| = −11,   A_22 = +|2 −4; 1 5| = 14,  A_23 = −|2 3; 1 −1| = 5
    A_31 = +|3 −4; −4 2| = −10,   A_32 = −|2 −4; 0 2| = −4,  A_33 = +|2 3; 0 −4| = −8

We form the transpose of the above matrix of cofactors to obtain the classical adjoint of A:

    adj A = [−18 −11 −10; 2 14 −4; 4 5 −8]

Theorem 8.8: For any square matrix A,

    A · (adj A) = (adj A) · A = |A| I

where I is the identity matrix. Thus, if |A| ≠ 0,

    A^{-1} = (1/|A|) (adj A)

Observe that the above theorem gives us an important method of obtaining the inverse of a given matrix.

Example 8.14: Consider the matrix A of the preceding example for which |A| = −46. We have

    A (adj A) = [2 3 −4; 0 −4 2; 1 −1 5] [−18 −11 −10; 2 14 −4; 4 5 −8] = [−46 0 0; 0 −46 0; 0 0 −46] = −46 I = |A| I

We also have, by Theorem 8.8,

    A^{-1} = (1/|A|)(adj A) = [−18/−46 −11/−46 −10/−46; 2/−46 14/−46 −4/−46; 4/−46 5/−46 −8/−46]
           = [9/23 11/46 5/23; −1/23 −7/23 2/23; −2/23 −5/46 4/23]

APPLICATIONS TO LINEAR EQUATIONS

Consider a system of n linear equations in n unknowns:

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
    ............................................
    a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

Let Δ denote the determinant of the matrix A = (a_ij) of coefficients: Δ = |A|.
Also, let Δ_i denote the determinant of the matrix obtained by replacing the ith column of A by the column of constant terms. The fundamental relationship between determinants and the solution of the above system follows.

Theorem 8.9: The above system has a unique solution if and only if Δ ≠ 0. In this case the unique solution is given by

    x_1 = Δ_1/Δ,   x_2 = Δ_2/Δ,   ...,   x_n = Δ_n/Δ

The above theorem is known as "Cramer's rule" for solving systems of linear equations. We emphasize that the theorem only refers to a system with the same number of equations as unknowns, and that it only gives the solution when Δ ≠ 0. In fact, if Δ = 0 the theorem does not tell whether or not the system has a solution. However, in the case of a homogeneous system we have the following useful result.

Theorem 8.10: The homogeneous system Ax = 0 has a nonzero solution if and only if Δ = |A| = 0.

Example 8.15: Solve, using determinants:   2x − 3y = 7,  3x + 5y = 1.

First compute the determinant Δ of the matrix of coefficients:

    Δ = |2 −3; 3 5| = 10 + 9 = 19

Since Δ ≠ 0, the system has a unique solution. We also have

    Δ_x = |7 −3; 1 5| = 35 + 3 = 38,   Δ_y = |2 7; 3 1| = 2 − 21 = −19

Accordingly, the unique solution of the system is

    x = Δ_x/Δ = 38/19 = 2,   y = Δ_y/Δ = −19/19 = −1

We remark that the preceding theorem is of interest more for theoretical and historical reasons than for practical reasons. The previous method of solving systems of linear equations, i.e. by reducing a system to echelon form, is usually much more efficient than by using determinants.

DETERMINANT OF A LINEAR OPERATOR

Using the multiplicative property of the determinant (Theorem 8.5), we obtain

Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.

Now suppose T is an arbitrary linear operator on a vector space V. We define the determinant of T, written det (T), by

    det (T) = |[T]_e|

where [T]_e is the matrix of T in a basis {e_i}. By the above theorem this definition is independent of the particular basis that is chosen.
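The basis-independence can be seen numerically. A brief sketch (Python, not part of the text) using the operator T(x, y) = (2x − 3y, x + y) of Problem 7.26(i): its matrix in the usual basis and its matrix in the basis f_1 = (1, 2), f_2 = (2, 3) (the answer to Problem 7.27) are similar, so they must have the same determinant:

```python
def det2(M):
    """Determinant of a 2x2 matrix: ad - bc (Example 8.8)."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# T(x, y) = (2x - 3y, x + y) in two different bases:
T_e = [[2, -3], [1, 1]]        # usual basis (Problem 7.26(i))
T_f = [[18, 25], [-11, -15]]   # basis f1 = (1, 2), f2 = (2, 3) (Problem 7.27)

# Similar matrices have equal determinants (Theorem 8.11),
# so det(T) is well defined; here both equal 5.
```

Whichever basis is used, the same scalar det (T) = 5 results.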
The next theorem follows from the analogous theorems on matrices.

Theorem 8.12: Let T and S be linear operators on a vector space V. Then
(i) det (S ∘ T) = det (S) · det (T),
(ii) T is invertible if and only if det (T) ≠ 0.

We also remark that det (1_V) = 1 where 1_V is the identity mapping, and that det (T^{-1}) = det (T)^{-1} if T is invertible.

Example 8.16: Let T be the linear operator on R^3 defined by

    T(x, y, z) = (2x − 4y + z, x − 2y + 3z, 5x + y − z)

The matrix of T in the usual basis of R^3 is [T] = [2 −4 1; 1 −2 3; 5 1 −1]. Then

    det (T) = |2 −4 1; 1 −2 3; 5 1 −1| = 2(2 − 3) + 4(−1 − 15) + 1(1 + 10) = −55

MULTILINEARITY AND DETERMINANTS

Let 𝒜 denote the set of all n-square matrices A over a field K. We may view A as an n-tuple consisting of its row vectors A_1, A_2, ..., A_n:

    A = (A_1, A_2, ..., A_n)

Hence 𝒜 may be viewed as the set of n-tuples of vectors in K^n. The following definitions apply.

Definition: A function D : 𝒜 → K is said to be multilinear if it is linear in each of the components; that is:
(i) if row A_i = B + C, then D(A) = D(..., B + C, ...) = D(..., B, ...) + D(..., C, ...);
(ii) if row A_i = kB where k ∈ K, then D(A) = D(..., kB, ...) = k D(..., B, ...).

We also say n-linear for multilinear if there are n components.

Definition: A function D : 𝒜 → K is said to be alternating if D(A) = 0 whenever A has two identical rows:

    D(A_1, A_2, ..., A_n) = 0   whenever A_i = A_j, i ≠ j

We have the following basic result; here I denotes the identity matrix.

Theorem 8.13: There exists a unique function D : 𝒜 → K such that: (i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. This function D is none other than the determinant function; that is, for any matrix A ∈ 𝒜, D(A) = |A|.

Solved Problems

COMPUTATION OF DETERMINANTS

8.1. Evaluate the determinant of each matrix: (i) [3 −2; 4 5], (ii) [a−b a; a a+b].

(i) |3 −2; 4 5| = 3·5 − (−2)·4 = 23.

(ii) |a−b a; a a+b| = (a − b)(a + b) − a·a = −b².

8.2.
Determine those values of k for which k k 4 2k k k 4 2k fe = 2, the determinant is zero. = 0. = 2A;2 - 4fc = 0, or 2fc(fe - 2) = 0. Hence k - Q; and k = 2. That is, if fe = or 8.3. Compute the determinant of each matrix: '1 2 3\ /2 l\ / 2 1^ (i) [4-2 3 ^2 5 -ly (i) 1 2 3 -2 3 4 3 4 -2 4 -2 3 = 1 - 2 + 3 5 -1 2 -1 2 5 2 5 -1 (ii) (iii) (iv) 1(2-15) - 2(-4-6) + 3(20 + 4) = 79 |4 2 2 1 2 -3 4 -3 4 2 -3 = 2 3 1 - 5 1 5 3 1 + ll 6 3 2 1 3 2-3 -1 -3 5 10 3 2-4 4 13 = 2(10-9) + l(-9 + 2) = -5 = 1(6 + 4) = 10 = 24 0\ (ii) (4 2 -3 ) , (iii) ( 3 2 -3 I , (iv) I 3 2 -4 | . \-l -3 8.4. Consider the 3-square matrix A — \a2 &2 C2 \a3 bs cs be used to obtain the determinant of A: Show that the diagrams below can Form the product of each of the three numbers joined by an arrow in the diagram on the left, and precede each product by a plus sign as follows: 180 DETERMINANTS [CHAP. 8 Now form the product of each of the three numbers joined by an arrow in the diagram on the right, and precede each product by a minus sign as follows: — asftgCi — bgC^ai — Cgagfei Then the determinant of A is precisely the sum of the above two expressions: «! 61 Ci \A\ = a2 62 "2 «3 &3 "3 The above method of computing \A\ does not hold for determinants of order greater than 3. 8.5. Evaluate the determinant of each matrix: /2 -l\ /a b c> (i) 3 2 A -3 7/ (ii) (iii) / 3 2 - 10-2 \-2 3 3/ (i) Expand the determinant by the second column, neglecting terms containing a 0: 2 0-1 3 2 4-3 7 = -(-3) 2 -1 3 2 3(4 + 3) 21 (ii) Use the method of the preceding problem: a b c c a b b c a = a^ + 63 + c* — abc — abc — abc = 63 + c3 3a6c (iii) Add twice the first column to the third column, and then expand by the second row: 3 2 2 10 -2 3 -1 3 2-4 + 2(3) 1 -2 + 2(1) -2 3 3 + 2(-2) 2 2 3 -1 = 8 fi -1 8.6. Evaluate the determinant of A = 4 First multiply the first row by 6 and the second row by 4. 
Then 3-6-2 3 -6 + 4(3) -2 - (3) 3 2-4 = 1 2 + 4(3) -4 -(3) 1-4 1 1 -4 + 4(1) 1~(1) 6-4|A| - 24|A| = = + 6 14 28, and \A\ = 28/24 = 7/6. 3 6-5 1 14 -7 10 8.7. Evaluate the determinant of A Note that a 1 appears in the third row, first column. Apply the following operations on A (where i2j denotes the ith row): (i) add —2R3 to iSj, (ii) add 2R3 to i?2, (iii) add IB3 to R^. Thus CHAP. 8] DETERMINANTS 181 \A\ = 2 5 -3 -2 2 -3 2 -5 1 3 -2 2 1 -6 4 3 -1 1 -6 3 -2 -1 = + 1 3 -2 2 -3 2 5 -1 + 1 1 -6 + 6(1) 3-2 -2 -1 + 6(-2) -3 + 2 2 5 + 6(2) -1 1 -6 3 -2 -1 -3 2 5 1 1 -13 1 -2 -13 — — -1 17 1 2 17 = -4 8.8. Evaluate the determinant of A First reduce A to a matrix which has 1 as an entry, such as adding twice the first row to the second row, and then proceed as in the preceding problem. |A| = 3 -2 -5 4 ■5 2 8 -5 2 4 7 -3 2 -3 -5 8 3 -2 -5 4 1 -2 -2 3 2 4 7 -3 2 -3 -5 8 3 4 1 -5 1 2 3 3 2 1 -1 2 4 1-6 3 1-13 3-2 -5 4 -5 + 2(3) 2 + 2(-2) 8 + 2(-5) -5 + 2(4) -2 4 7-3 2-3-5 8 3 -2 + 2(3) -5 + 2(3) 4-3(3) 1 -2 + 2(1) -2 + 2(1) 3-3(1) -2 4 + 2(-2) 7 + 2(-2) -3-3(-2) 2 -3 + 2(2) -5 + 2(2) 8-3(2) 4 1 -5-(l) 3 3 - (3) 1 -1 2-(-l) 4 1 -5 3 3 = — 1 -1 2 4 -6 1 3 -3(12 + 6) = -54 /t + S -1 1 8.9. Evaluate the determinant of A = I 5 t-B 1 \ 6 -6 t + 4/ Add the second column to the first column, and then add the third column to the second column to obtain t + 2 1 \A\ = t + 2 t-2 1 t-2 t + 4 Now factor t + 2 from the first column and t — 2 from the second column to get 10 1 |A| = (t + 2)(t-2) 11 1 1 « + 4 Finally subtract the first column from the third column to obtain \A\ = (< + 2)(t-2) 10 11 1 t + 4 = (t+2)(t-2)(t + 4) 182 DETERMINANTS [CHAP. 8 COFACTORS 8.10. Find the cofactor of the 7 in the matrix 2 1 -3 ■1 5 -4 7 -2 ^ 4 6 -3 3 -2 5 2 2 1 4 2 1 4 / 4 -3 4 -3 rz — 4 -3 7 10 3 -2 2 7 10 \ (-1)2 The exponent 2 + 3 comes from the fact that 7 appears in the second row, third column. 61 /I 2 3\ 8.11. Consider the matrix A = 2 3 4 , (i) Compute |A| , (ii) Find adj A. 
(iii) Verify \l 5 7/ A • (adj A) = \A\ I. (iv) Find A-\ (i) \A\ = 1 3 4 5 7 2 4 1 7 + 3 2 3 1 sl = 1-20 + 21 = 2 (ii) adj A + 3 4 5 7 - 2 4 1 7 + 2 3 5 7 + 1 3 1 7 + 2 3 3 4 - 1 3 2 4 + That is adj A is the transpose of the matrix of cof actors. Observe that the "signs" in the u - +\ matrix of cofactors form the chessboard pattern — + — . \+ - +/ (iii) A '(adj A) = (iv) A-i = |]4i(adjA) = |A|/ a b c d 8.12. Consider an arbitrary 2 by 2 matrix A • (i) Find adj A. (ii) Show that adj (adj A) = A. (X) adj A = (^_|^| ^1^1 j - \^_^ „; - (^_, „; / +la| -l-ciy ^ /a c - \^_|_6| +|di; v'' ^ (ii) adj (adj A) = adj d -b -e a CD- = A CHAP. 8] DETERMINANTS 183 DETERMINANTS AND SYSTEMS OF LINEAR EQUATIONS 8.13. Solve for x and y, using determinants: 2a; + y = 1 , __ ax — 2hy — c (i) 3x — 5y — 4 2 1 3 -5 y = A„/A = 1. (ii) , where ab ¥^ 0. Sax — 5by — 2c -13, A:, = 7 1 4 -5 = -39, Ay = 2 7 3 -4 = -13. Then x = A^/A = 3, (ii) A = = a6, Ax = a -26 3a -56 —c/a, y — Aj,/A = —c/b c -26 2c -56 -be, Aj, = a. c 3a 2c = — ac. Then a; = A^-M = 8.14. Solve using determinants: Sy + 2x = z + 1 3a; + 22; - S - 5y . 3z - 1 = a; - 2i/ First arrange the system in standard form with the unknowns appearing in columns: 2x + 3y - z = 1 3x + 5y + 2z - 8 X -2y - 3z = -1 Compute the determinant A of the matrix A of coefficients: 2 3-1 3 5 2 A = -2 -3 = 2(-15 + 4) - 3(-9-2) - l(-6-5) = 22 Since A t^ 0, the system has a unique solution. To obtain A^., A,, and A^, replace the coefficients of the unknown in the matrix A by the column of constants. Thus A, = 13-1 8 5 2 -1 -2 -3 = 66, A„ = 2 1-1 3 8 2 1 -1 -3 -22, A, = and X - Aj./A = 3, 2/ = Aj,/A = -1, z = A^/A = 2. 2 3 1 3 5 8 1 -2 -1 = 44 PROOF OF THEOREMS 8.15. Prove Theorem 8.1: |A*| = |A|. Suppose A = (tty). Then A* = (6jj) where 6y = a^;. Hence l^'l = 2 (Sgn a) 6io.(i) 62,^(2) . . . 6„o-(n) aes„ = 2 (sgn ff)a„(i).ia<r(2),2 ••• «o-(n),n Let T = <r~i. 
By Problem 8.36, sgn r = sgn a, and Hence |A«| = 2 (sgn t) ai^d) a2T(2) • • • «nT(n) <Tes„ However, as a runs through all the elements of S„, t = <r~i also runs through all the elements of S„. Thus \At\ = |A|. 8.16. Prove Theorem 8.3(ii): Let B be obtained from a square matrix A by interchanging two rows (columns) of A. Then |B| = — |A|. We prove the theorem for the case that two columns are interchanged. Let t be the trans- position which interchanges the two numbers corresponding to the two columns of A that are interchanged. If A = (oy) and B — (6jj), then 6y — Ojtcj)- Hence, for any permutation a. 184 DETERMINANTS [CHAP. 8 Thus \B\ = 2 (sgn ff)6io.(i) 620.(2) ... &no-(n) = 2 (sgn a) OiTirCl) "■2to-(2) • • • "-nraCn) Since the transposition t is an odd permutation, sgn ra = sgn r • sgn <r = — sgn a. Thus sgn a = — sgn TO, and so \B\ = — 2 (sgn TOr) a.iTO-(l)a2T<r(2) • • • "^nrcrCn) But as a runs through all the elements of S„, to also runs through all the elements of S^; hence \B\ = -\A\. 8.17. Prove Theorem 8.2: (i) If A has a row (column) of zeros, then \A\ = 0. (ii) If A has two identical rows (columns), then \A\ - 0. (iii) If A is triangular, then \A\ = product of diagonal elements. Thus in particular, |/| = 1 where / is the identity matrix. (i) Each term in |A| contains a factor from every row and so from the row of zeros. Thus each term of |A| is zero and so \A\ = 0. (ii) Suppose 1 + 1 # in K. If we interchange the two identical rows of A, we still obtain the matrix A. Hence by the preceding problem, 1A| = — |A| and so \A\ = 0. Now suppose 1 + 1 = in K. Then sgn <r = 1 for every a e S„. Since A has two iden- tical rows, we can arrange the terms of A into pairs of equal terms. Since each pair is 0, the determinant of A is zero. (iii) Suppose A = (ay) is lower triangular, that is, the entries above the diagonal are all zero: a^j = whenever i < j. 
Consider a term t of the determinant of A: t = (sgn <r) aiij a2i2 • • • '*"*n' where <t = i^H ...in Suppose ii ¥- 1. Then 1< ii and so a^^ = 0; hence t = 0. That is, each term for which ii 7^ 1 is zero. Now suppose ti = 1 but iz ¥- 2. Then 2 < ig and so a^ = 0; hence t = 0. Thus each term for which ij ^ 1 or 12 ^ 2 is zero. Similarly we obtain that each term for which ij 7^ 1 or % # 2 or ... or t„ 9^ n is zero. Accordingly, 1A| = a^^a^^ . . . a^n = product of diagonal elements. 8.18. Prove Theorem 8.3: Let B be obtained from A by (i) multiplying a row (column) of A by a scalar fe; then |B| = fe |A| . (ii) interchanging two rows (columns) of A; then |B| = - |A|. (iii) adding a multiple of a row (column) of A to another; then |B1 = \A\. (i) If the jth row of A is multiplied by fc, then every term in |A| is multiplied by fc and so \B\ = k\A\. That is, |B| = 2 (sgn o) an a2t2 • • • C^^jiP • • • «ni„ = fc 2 (sgn a) aii a2i2 . . . Oni„ = ^ 1^1 a (ii) Proved in Problem 8.16. (iii) Suppose c times the feth row is added to the jth row of A. Using the symbol /\ to denote the yth position in a determinant term, we have \B\ = 2 (sgn or) aii aji^ . . . {ca^i^ + ajj.) . . . a„i^ = c 2 (sgn <r) a„j agi^ • . • «fci^ • • ■ ««!„ + 2 (sgn c) a^i^ a^i^. . . a^. . . . a„i^ The first sum is the determinant of a matrix whose feth and ;th rows are identical; hence by Theorem 8.2(ii) the sum is zero. The second sum is the determinant of A. Thus |B| = c'0 + 1A| = A. CHAP. 8] DETERMINANTS 185 8.19. Prove Lemma 8.6: For any elementary matrix £", l^'A] = IE"! |A|, Consider the following elementary row operations: (i) multiply a row by a constant A; # 0; (ii) interchange two rows; (iii) add a multiple of one row to another. Let E^, JS?2 and E^ be the corresponding elementary matrices. That is, Sj, E^ and E^ are obtained by applying the above operations, respectively, to the identity matrix /. 
By the preceding problem, l^il = k\I\ = k, \E^\ = -\I\ = -1, \E,\ = |/| = 1 Recall (page 56) that SjA is identical to the matrix obtained by applying the corresponding operation to A. Thus by the preceding problem, \E^A\ = k\A\ = \Ei\\A\, lE^A] = -\A\ = l^^l lA], \E,A\ = \A\ = 1|A| = I^g] 1A| and the lemma is proved. 8.20. Suppose B is row equivalent to A; say B = EnEn-i . . . E2E1A where the E, are elementary matrices. Show that: (i) \B\ = \En\ \Er,-i\ . . . \E2\ \Ei\ \A\, (ii) \B\ ^ if and only if \A\ ^ 0. (i) By the preceding problem, |J7iA| = |Bi| 1A|. Hence fey induction, \B\ = \E„\\E„_,...E2E,A\ = \E^\\E,_,\...\E2\\E,\\A\ (ii) By the preceding problem, ^i ^ for each i. Hence \B\ ¥= if and only if \A\ ¥- 0. 8.21. Prove Theorem 8.4: Let A be an w-square matrix. Then the following are equivalent: (i) A is invertible, (ii) A is nonsingular, (iii) |A| 9^ 0. By Problem 6.44, (i) and (ii) are equivalent. Hence it suffices to show that (i) and (iii) are equivalent. Suppose A is invertible. Then A is row equivalent to the identity matrix /. But |/| ■?* 0; hence by the preceding problem, |A| ^ 0. On the other hand, suppose A is not invertible. Then A is row equivalent to a matrix B which has a zero row. By Theorem 8.2(i), \B\ = 0; then by the preceding problem, \A\ = 0. Thus (i) and (iii) are equivalent. 8.22. Prove Theorem 8.5: \AB\ = \A\\B\. If A is singular, then AB is also singular and so \AB\ = = |A| |B|. On the other hand if A is nonsingular, then A =^ E^ . . . E2E1, a product of elementary matrices. Thus, by Problem 8.20, |A| = \E^...E^E,I\ = \E„\...\E2\\E,\\I\ = \EJ...\E2\\E,\ and so |AJ5| = \E„...E2E,B\ = \EJ . . . lE^WE^WB] = |A| |B| 8.23. Prove Theorem 8.7: Let A = (a«); then \A\ = anAn + ttizAia + • • • + ai„Ai„, where Aij is the cofactor of an. Each term in \A\ contains one and only one entry of the ith row (aij.Ojg, . . ., a,„) of A. 
Hence we can write \A \ in the form |A| = ajiAfi + ai2A*2 + ■■■ + ai„Af„ (Note Ay is a sum of terms involving no entry of the ith row of A.) Thus the theorem is proved if we can show that At. = A;,. = (-l)«+i|M„I where Afy is the matrix obtained by deleting the row and column containing the entry ay. (His- torically, the expression Ay was defined as the cofactor of Oy, and so the theorem reduces to showing that the two definitions of the cofactor are equivalent.) 186 DETERMINANTS [CHAP. 8 First we consider the case that i = n, j = n. Then the sum of terms in \A\ containing a„„ is Orm'^nn = «nn 2 (sgn a) ffli<r(i) 02<t(2) • • • <»n-l,cr(n-l) a where we sum over all permutations aSS„ for which ain) = n. However, this is equivalent (Prob- lem 8.63) to summing over all permutations of {1, . . .,n-l}. Thus A*„ = |M„„| = (-!)»+« 1M„„| . Now we consider any i and }. We interchange the ith. row with each succeeding row until it is last, and we interchange the jth column with each succeeding column until it is last. Note that the determinant |Afy| is not affected since the relative positions of the other rows and columns are not affected by these interchanges. However, the "sign" of |A| and of Ay is changed n — i and then n — j times. Accordingly, A% = (-l)»-i + »-i |M„| = (-l)* + MMy| 8.24. Let A = (an) and let B be the matrix obtained from A by replacing the ith row of A by the row vector (bn, . . . , &m). Show that |B| = biiAn + bt2Aa + • • • 4- &i„Ai„ Furthermore, show that, for j ¥= i, ajiAn + (lizAti + • • • + ttjnAin — and aijAii + 023^2! + • • • 4- a„iA„i = Let B = (6y). By the preceding problem, \B\ = ftiiBa + 6i2^i2 + ■ • • + 6i„Bi„ Since By does not depend upon the ith row of B, By = Ay for j = 1 n. Hence \B\ = ftjiAji + 6i2Ai2 + • • • + 6i„Ai„ Now let A' be obtained from A by replacing the ith row of A by the jth row of A. Since A' has two identical rows, |A'| = 0. 
Thus by the above result,
|A'| = a_j1 A_i1 + a_j2 A_i2 + ... + a_jn A_in = 0
Using |A^t| = |A|, we also obtain that a_1j A_1i + a_2j A_2i + ... + a_nj A_ni = 0.

8.25. Prove Theorem 8.8: A(adj A) = (adj A)A = |A| I. Thus if |A| ≠ 0, A^-1 = (1/|A|)(adj A).
Let A = (a_ij) and let A(adj A) = (b_ij). The ith row of A is
(a_i1, a_i2, ..., a_in)   (1)
Since adj A is the transpose of the matrix of cofactors, the jth column of adj A is the transpose of the cofactors of the jth row of A:
(A_j1, A_j2, ..., A_jn)^t   (2)
Now b_ij, the ij-entry in A(adj A), is obtained by multiplying (1) and (2):
b_ij = a_i1 A_j1 + a_i2 A_j2 + ... + a_in A_jn
Thus by Theorem 8.7 and the preceding problem, b_ij = |A| if i = j, and b_ij = 0 if i ≠ j. Accordingly, A(adj A) is the diagonal matrix with each diagonal element |A|. In other words, A(adj A) = |A| I. Similarly, (adj A)A = |A| I.

CHAP. 8] DETERMINANTS 187

8.26. Prove Theorem 8.9: The system of linear equations Ax = b has a unique solution if and only if Δ = |A| ≠ 0. In this case the unique solution is given by x1 = Δ1/Δ, x2 = Δ2/Δ, ..., xn = Δn/Δ.
By preceding results, Ax = b has a unique solution if and only if A is invertible, and A is invertible if and only if Δ = |A| ≠ 0.
Now suppose Δ ≠ 0. By Problem 8.25, A^-1 = (1/Δ)(adj A). Multiplying Ax = b by A^-1, we obtain
x = A^-1 Ax = (1/Δ)(adj A)b   (1)
Note that the ith row of (1/Δ)(adj A) is (1/Δ)(A_1i, A_2i, ..., A_ni). If b = (b1, b2, ..., bn)^t then, by (1),
x_i = (1/Δ)(b1 A_1i + b2 A_2i + ... + bn A_ni)
However, as in Problem 8.24, b1 A_1i + b2 A_2i + ... + bn A_ni = Δ_i, the determinant of the matrix obtained by replacing the ith column of A by the column vector b. Thus x_i = (1/Δ)Δ_i, as required.

8.27. Suppose P is invertible. Show that |P^-1| = |P|^-1.
P^-1 P = I. Hence 1 = |I| = |P^-1 P| = |P^-1| |P|, and so |P^-1| = |P|^-1.

8.28. Prove Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.
Since A and B are similar, there exists an invertible matrix P such that B = P^-1 A P.
Then by the preceding problem,
|B| = |P^-1 A P| = |P^-1| |A| |P| = |A| |P^-1| |P| = |A|
We remark that although the matrices P^-1 and A may not commute, their determinants |P^-1| and |A| do commute since they are scalars in the field K.

8.29. Prove Theorem 8.13: There exists a unique function D : 𝒜 → K such that (i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. This function D is the determinant function, i.e. D(A) = |A|.
Let D be the determinant function: D(A) = |A|. We must show that D satisfies (i), (ii) and (iii), and that D is the only function satisfying (i), (ii) and (iii).
By preceding results, D satisfies (ii) and (iii); hence we need show that it is multilinear. Suppose A = (a_ij) = (A1, A2, ..., An) where Ak is the kth row of A. Furthermore, suppose for a fixed i,
Ai = Bi + Ci,  where Bi = (b1, ..., bn) and Ci = (c1, ..., cn)
Accordingly, a_i1 = b1 + c1, a_i2 = b2 + c2, ..., a_in = bn + cn. Expanding D(A) = |A| by the ith row,
D(A) = D(A1, ..., Bi + Ci, ..., An) = a_i1 A_i1 + a_i2 A_i2 + ... + a_in A_in
     = (b1 + c1)A_i1 + (b2 + c2)A_i2 + ... + (bn + cn)A_in
     = (b1 A_i1 + b2 A_i2 + ... + bn A_in) + (c1 A_i1 + c2 A_i2 + ... + cn A_in)
However, by Problem 8.24, the two sums above are the determinants of the matrices obtained from A by replacing the ith row by Bi and Ci respectively. That is,
D(A) = D(A1, ..., Bi + Ci, ..., An) = D(A1, ..., Bi, ..., An) + D(A1, ..., Ci, ..., An)
Furthermore, by Theorem 8.3(i),
D(A1, ..., kAi, ..., An) = k D(A1, ..., Ai, ..., An)
Thus D is multilinear, i.e. D satisfies (i).
We next must prove the uniqueness of D. Suppose D satisfies (i), (ii) and (iii). If {e1, ..., en} is the usual basis of K^n, then by (iii), D(e1, e2, ..., en) = D(I) = 1. Using (ii) we also have (Problem 8.73) that
D(e_i1, e_i2, ..., e_in) = sgn σ,  where σ = i1 i2 ... in   (1)
Now suppose A = (a_ij). Observe that the kth row Ak of A is
Ak = (a_k1, a_k2, ..., a_kn) = a_k1 e1 + a_k2 e2 + ... + a_kn en
Thus
D(A) = D(a_11 e1 + ... + a_1n en, a_21 e1 + ... + a_2n en, ..., a_n1 e1 + ... + a_nn en)
Using the multilinearity of D, we can write D(A) as a sum of terms of the form
D(A) = Σ (a_1i1 a_2i2 ... a_nin) D(e_i1, e_i2, ..., e_in)   (2)
where the sum is taken over all sequences i1 i2 ... in where ik ∈ {1, ..., n}. If two of the indices are equal, say ij = ik but j ≠ k, then by (ii),
D(e_i1, e_i2, ..., e_in) = 0
Accordingly, the sum in (2) need only be summed over all permutations σ = i1 i2 ... in. Using (1), we finally have that
D(A) = Σ_σ (a_1i1 a_2i2 ... a_nin) D(e_i1, e_i2, ..., e_in) = Σ_σ (sgn σ) a_1i1 a_2i2 ... a_nin,  where σ = i1 i2 ... in
Hence D is the determinant function and so the theorem is proved.

PERMUTATIONS
8.30. Determine the parity of σ = 542163.
Method 1. We need to obtain the number of pairs (i, j) for which i > j and i precedes j in σ. There are:
3 numbers (5, 4 and 2) greater than and preceding 1,
2 numbers (5 and 4) greater than and preceding 2,
3 numbers (5, 4 and 6) greater than and preceding 3,
1 number (5) greater than and preceding 4,
0 numbers greater than and preceding 5,
0 numbers greater than and preceding 6.
Since 3 + 2 + 3 + 1 + 0 + 0 = 9 is odd, σ is an odd permutation and so sgn σ = -1.
Method 2. Transpose 1 to the first position as follows: 542163 to 154263. Transpose 2 to the second position: 154263 to 125463. Transpose 3 to the third position: 125463 to 123546. Transpose 4 to the fourth position: 123546 to 123456. Note that 5 and 6 are in the "correct" positions. Count the number of numbers "jumped": 3 + 2 + 3 + 1 = 9. Since 9 is odd, σ is an odd permutation. (Remark: This method is essentially the same as the preceding method.)
Method 3. An interchange of two numbers in a permutation is equivalent to multiplying the permutation by a transposition. Hence transform σ to the identity permutation using transpositions; for example,
542163 → 142563 → 124563 → 123564 → 123465 → 123456
Since an odd number, 5, of transpositions was used, σ is an odd permutation.

8.31.
Let σ = 24513 and τ = 41352 be permutations in S5. Find (i) the composition permutations τ∘σ and σ∘τ, (ii) σ^-1.
Recall that σ = 24513 and τ = 41352 are short ways of writing
σ = (1 2 3 4 5 ; 2 4 5 1 3)  and  τ = (1 2 3 4 5 ; 4 1 3 5 2)
which means
σ(1) = 2, σ(2) = 4, σ(3) = 5, σ(4) = 1, σ(5) = 3  and  τ(1) = 4, τ(2) = 1, τ(3) = 3, τ(4) = 5, τ(5) = 2
(i) Applying σ and then τ: (τ∘σ)(1) = τ(2) = 1, (τ∘σ)(2) = τ(4) = 5, (τ∘σ)(3) = τ(5) = 2, (τ∘σ)(4) = τ(1) = 4, (τ∘σ)(5) = τ(3) = 3. Applying τ and then σ: (σ∘τ)(1) = σ(4) = 1, (σ∘τ)(2) = σ(1) = 2, (σ∘τ)(3) = σ(3) = 5, (σ∘τ)(4) = σ(5) = 3, (σ∘τ)(5) = σ(2) = 4.
Thus τ∘σ = 15243 and σ∘τ = 12534.
(ii) Interchanging the two rows of σ and rearranging,
σ^-1 = (2 4 5 1 3 ; 1 2 3 4 5) = (1 2 3 4 5 ; 4 1 5 2 3)
That is, σ^-1 = 41523.

8.32. Consider any permutation σ = j1 j2 ... jn. Show that for each pair (i, k) such that i > k and i precedes k in σ there is a pair (i*, k*) such that
i* < k* and σ(i*) > σ(k*)   (1)
and vice versa. Thus σ is even or odd according as to whether there is an even or odd number of pairs satisfying (1).
Choose i* and k* so that σ(i*) = i and σ(k*) = k. Then i > k if and only if σ(i*) > σ(k*), and i precedes k in σ if and only if i* < k*.

8.33. Consider the polynomial g = g(x1, ..., xn) = Π_{i<j} (x_i - x_j). Write out explicitly the polynomial g = g(x1, x2, x3, x4).
The symbol Π is used for a product of terms in the same way that the symbol Σ is used for a sum of terms. That is, Π_{i<j} (x_i - x_j) means the product of all terms (x_i - x_j) for which i < j. Hence
g = g(x1, ..., x4) = (x1 - x2)(x1 - x3)(x1 - x4)(x2 - x3)(x2 - x4)(x3 - x4)

8.34. Let σ be an arbitrary permutation. For the polynomial g in the preceding problem, define σ(g) = Π_{i<j} (x_σ(i) - x_σ(j)). Show that
σ(g) = g if σ is even, and σ(g) = -g if σ is odd.
Accordingly, σ(g) = (sgn σ)g.
Since σ is one-one and onto,
σ(g) = Π_{i<j} (x_σ(i) - x_σ(j)) = Π_{i<j or i>j} (x_i - x_j)
Thus σ(g) = g or σ(g) = -g according as to whether there is an even or an odd number of terms of the form (x_i - x_j) where i > j.
Note that for each pair (i, j) for which
i < j and σ(i) > σ(j)   (1)
there is a term (x_σ(i) - x_σ(j)) in σ(g) for which σ(i) > σ(j). Since σ is even if and only if there is an even number of pairs satisfying (1), we have σ(g) = g if and only if σ is even; hence σ(g) = -g if and only if σ is odd.

8.35. Let σ, τ ∈ Sn. Show that sgn (τ∘σ) = (sgn τ)(sgn σ). Thus the product of two even or two odd permutations is even, and the product of an odd and an even permutation is odd.
Using the preceding problem, we have
(sgn (τ∘σ))g = (τ∘σ)(g) = τ(σ(g)) = τ((sgn σ)g) = (sgn τ)(sgn σ)g
Accordingly, sgn (τ∘σ) = (sgn τ)(sgn σ).

8.36. Consider the permutation σ = j1 j2 ... jn. Show that sgn σ^-1 = sgn σ and, for scalars a_ij,
a_{j1 1} a_{j2 2} ... a_{jn n} = a_{1 k1} a_{2 k2} ... a_{n kn},  where σ^-1 = k1 k2 ... kn
We have σ^-1 ∘ σ = ε, the identity permutation. Since ε is even, σ^-1 and σ are both even or both odd. Hence sgn σ^-1 = sgn σ.
Since σ = j1 j2 ... jn is a permutation, a_{j1 1} a_{j2 2} ... a_{jn n} = a_{1 k1} a_{2 k2} ... a_{n kn}. Then k1, k2, ..., kn have the property that
σ(k1) = 1, σ(k2) = 2, ..., σ(kn) = n
Let τ = k1 k2 ... kn. Then for i = 1, ..., n,
(σ∘τ)(i) = σ(τ(i)) = σ(k_i) = i
Thus σ∘τ = ε, the identity permutation; hence τ = σ^-1.

MISCELLANEOUS PROBLEMS
8.37. Find det (T) for each linear operator T:
(i) T is the operator on R^3 defined by T(x, y, z) = (2x - z, x + 2y - 4z, 3x - 3y + z).
(ii) T is the operator on the vector space V of 2-square matrices over K defined by T(A) = MA, where M = (a b ; c d).
(i) Find the matrix representation of T relative to, say, the usual basis:
[T] = (2 0 -1 ; 1 2 -4 ; 3 -3 1)
Then
det (T) = 2(2 - 12) - 0 - 1(-3 - 6) = -20 + 9 = -11
(ii) Find a matrix representation of T in some basis of V, say
E1 = (1 0 ; 0 0), E2 = (0 1 ; 0 0), E3 = (0 0 ; 1 0), E4 = (0 0 ; 0 1)
Then
T(E1) = M E1 = (a 0 ; c 0) = aE1 + 0E2 + cE3 + 0E4
T(E2) = M E2 = (0 a ; 0 c) = 0E1 + aE2 + 0E3 + cE4
T(E3) = M E3 = (b 0 ; d 0) = bE1 + 0E2 + dE3 + 0E4
T(E4) = M E4 = (0 b ; 0 d) = 0E1 + bE2 + 0E3 + dE4
Thus
[T]e = (a 0 c 0 ; 0 a 0 c ; b 0 d 0 ; 0 b 0 d)
and
det (T) = a^2 d^2 + b^2 c^2 - 2abcd = (ad - bc)^2

8.38. Find the inverse of A = (1 1 1 ; 0 1 1 ; 0 0 1).
The inverse of A is of the form (Problem 8.53):
A^-1 = (1 x y ; 0 1 z ; 0 0 1)
Set AA^-1 = I, the identity matrix:
AA^-1 = (1 x+1 y+z+1 ; 0 1 z+1 ; 0 0 1) = (1 0 0 ; 0 1 0 ; 0 0 1)
Set corresponding entries equal to each other to obtain the system
x + 1 = 0,  y + z + 1 = 0,  z + 1 = 0
The solution of the system is x = -1, y = 0, z = -1. Hence A^-1 = (1 -1 0 ; 0 1 -1 ; 0 0 1). A^-1 could also be found by the formula A^-1 = (adj A)/|A|.

8.39. Let D be a 2-linear, alternating function. Show that D(A, B) = -D(B, A). More generally, show that if D is multilinear and alternating, then
D(..., A, ..., B, ...) = -D(..., B, ..., A, ...)
that is, the sign is changed whenever two components are interchanged.
Since D is alternating, D(A + B, A + B) = 0. Furthermore, since D is multilinear,
0 = D(A + B, A + B) = D(A, A + B) + D(B, A + B) = D(A, A) + D(A, B) + D(B, A) + D(B, B)
But D(A, A) = 0 and D(B, B) = 0. Hence
0 = D(A, B) + D(B, A)  or  D(A, B) = -D(B, A)
Similarly,
0 = D(..., A + B, ..., A + B, ...)
  = D(..., A, ..., A, ...) + D(..., A, ..., B, ...) + D(..., B, ..., A, ...) + D(..., B, ..., B, ...)
  = D(..., A, ..., B, ...) + D(..., B, ..., A, ...)
and thus D(..., A, ..., B, ...) = -D(..., B, ..., A, ...).

8.40. Let V be the vector space of 2 by 2 matrices M = (a b ; c d) over R. Determine whether or not D : V → R is 2-linear (with respect to the rows) if (i) D(M) = a + d, (ii) D(M) = ad.
(i) No. For example, suppose A = (1, 1) and B = (3, 3). Then
D(A, B) = D(1 1 ; 3 3) = 4  and  D(2A, B) = D(2 2 ; 3 3) = 5 ≠ 2D(A, B)
(ii) Yes.
Let A = (tti, a^, B = (61, 63) and C = (cj, Ca); then D(A,C) = dT' "^) = 01C2 and D(B,C) = ■d(^' ^') = V2 Hence for any scalars s, < G R, D(sA + tB,C) = i?/«»i + **^ sa,+ tb,^^ ^ ^sa, + th,)c, = s(aiC2) + t(6iC2) = 8D(A, C) + tD{B, C) That is, D is linear with respect to the first row. Furthermore, D(C,A) = d(^; I'J = c,a, and D(C, B) = D^^^ ^ = c,b. Hence for any scalars s, t S R, D(C,sA + tB) ^ d( "J-., „?..)= Ci(sa2+t62) = s(cia2) + t(ci62) = sI>(C,A) + «D(C,B) That is, D is linear with respect to the second row. Both linearity conditions imply that D is 2-linear. CHAP. 8] DETERMINANTS 193 Supplementary Problems COMPUTATION OF DETERMINANTS 8.41. Evaluate the determinant of each matrix: (i) 8.42. Compute the determinant of each matrix: (i) 2 5 4 1 t-2 (ii) 6 1 3 -2 -3 \ .... /t-5 7 4 t-l)' <"> -1 ( + 3 8.43. For each matrix in the preceding problem, find those values of t for which the determinant is zero. 8.44. Compute the determinant of each matrix: /2 1 l\ /3 -2 -4\ (i) 5 -2 , (ii) 2 5 -1 , (iii) \1 -3 4/ \0 6 1/ 8.45. Evaluate the determinant of each matrix: (i) /-2 -1 4 f - 1 3 -3 (ii) I -3 t + 5 -3 -6 6 t-4,1 8.46. For each matrix in the preceding problem, determine those values of t for which the determinant is zero. 8.47. Evaluate the determinant of each matrix: (i) /l 2 2 3 10-20 \4 -3 2) COF ACTORS, CLASSICAL ADJOINTS, INVERSES 'l 2-2 3^ 8.48. For the matrix I „ „ I, find the cof actor of: 4 2 1 ,1 7 2 -3^ (i) the entry 4, (ii) the entry 5, (iii) the entry 7. 8.49. Let A Find (i) adj A, (ii) A-i. 2 1 3 2\ 3 1-2 3-1 1-21' ^"^ I 1 -1 4 3 l2 2 -1 1/ '1 2 2' 8.50. Let A = f 3 1 a 1 1, Find (i) adj A, (ii) A-i. 8.51. Find the classical adjoint of each matrix in Problem 8.47. 8.52. Determine the general 2 by 2 matrix A for which A = adj A. 8.53. Suppose A is diagonal and B is triangular; say, ' di ... 02 ... A = and B ,0 62 lO K j 194 DETERMINANTS [CHAP. 8 (i) Show that adj A is diagonal and adj B is triangular. 
(ii) Show that B is invertible iff all 6; ¥= 0; hence A is invertible iff all «{ ^ 0. (iii) Show that the inverses of A and B (if either exists) are of the form t-i ... \ /&r* di2 ■•• dm 62 ... d2n A-1 = |0 ar^ ••■ I 5_, ^ lO ... a- ' \o That is, the diagonal elements of A-i and B-^ are the inverses of the corresponding diagonal elements of A and B. DETERMINANT OF A LINEAR OPERATOR 8.54. Let T be the linear operator on R^ defined by T{x, y, z) = (3a; - 2z, 5y + 7z,x + y + z) Find det (T). 8.55. Let DiV^V be the differential operator, i.e. D{v) = dv/dt. Find det (D) if V is the space gen- erated by (i) {l,t, .... t"}, (ii) {e*, e^*, e^t}, (iii) {sin t, cos t}. 8.56. Prove Theorem 8.12: Let T and S be linear operators on V. Then: (i) det (S°T) = det(S)'det(r); (ii) T is invertible if and only if det (7) 9^ 0. 8.57. Show that: (i)det(lv) = l where ly is the identity operator; (ii) det (T-i) = det(r)-i if T is invertible. DETERMINANTS AND LINEAR EQUATIONS ,.. (Sx + 5y = 8 ,.., f2x-Sy = -1 8.58. Solve by determinants: (1) i , „ , (") i . , „ . • [ix — 2y = l I4x + 7y = —1 (2x-5y + 2z = 7 (2z + 3 = y + Sx 8.59. Solve by determinants: (i) } x + 2y - Az = 3 (ii) ■< x - Sz = 2y + 1. [sx-iy-Gz = 5 [sy + z = 2 - 2x 8.60. Prove Theorem 8.10: The homogeneous system Ax = has a nonzero solution if and only if A = IA| == 0. PERMUTATIONS 8.61. Determine the parity of these permutations in Sg: (i) a = 3 2 1 5 4, (ii) r = 1 3 5 2 4, (iii) y = 4 2 5 3 1. 8.62. For the permutations <r, t and v in Problem 8.61, find (i) r°c, (ii) Tr°<r, (iii) cr-i, (iv) t-i. 8.63. Let T e 5„. Show that t°<t runs through S„ as a runs through S„; that is, S„ = {t « a : a e iSf„}. 8.64. Let aeS„ have the property that <t(w) = n. Let a* e S„_i be defined by <f*(x) = <r(x). (i) Show that sgn <r* = sgn a. (ii) Show that as a runs through S„, where cr{n) = n, c* runs through S„_i; that is, S„_i = {a* : <r e S„, <T(m) = n}. MULTILINEARITY 8.65. Let V = (K"")"", i.e. 
V is the space of m-square matrices viewed as m-tuples of row vectors. Let D:V-^K. (i) Show that the following weaker statement is equivalent to D being alternating: i?(Ai,A2, ...,A„) = whenever Aj = Ai+i for some i. (ii) Suppose D is m-linear and alternating. Show that if A^.Az, ■■■,A^ are linearly dependent, then D(Ai,...,AJ = 0. CHAP. 8] DETERMINANTS 195 8.66. Let V be the space of 2 by 2 matrices M = ( j over R. Determine whether or not D:V->^K is 2-linear (with respect to the rows) if (i) D{M) = ac-hd, (ii) D{M) = ah- ed, (iii) D{M) = 0, (iv) D{M) = 1. 8.67. Let V be the space of M-square matrices over K. Suppose B B V is invertible and so det (B) ¥= 0. Define Z> : V -* X by Z?(A) = det (Afi)/det (B) where A G V. Hence fl(Ai, Ag, . . . , A„) = det (AiB, A^B, .... A„B)/det (B) where Aj is the tth row of A and so A^B is the ith row of AB. Show that Z> is multilinear and alternating, and that !>(/) = 1. (Thus by Theorem 8.13, D{A) - det (A) and so det (AB) = det (A) det (B). This method is used by some texts to prove Theorem 8.5, i.e. |Ai5| = \A\ |B|.) MISCELLANEOUS PROBLEMS 8.68. Let A be an w-square matrix. Prove IfeAl = fc" \A\ . 8.69. Prove: 1 x^ x\ ... x\~^ '2 ■ ■ ■ X^ 1 X2 xf ~"-l 1 a:„ xl ... «»-» The above is called the Vandermonde determinant of order n. 8.70. Consider the block matrix M = f j where A and C are square matrices. Prove \M\ = \A\ \C\. More generally, prove that if M is a triangular block matrix with square matrices Aj, . . ., A^ on the diagonal, then \M\ = |Ai| [Agl • • "l^ml- 8.71. Let A, B, C and D be commuting m-square matrices. Consider the 2»-square block matrix ^ " (c d)- P'^^^^tl'** W = \A\\D\-\B\\C\. 8.72. Suppose A is ortfeoflroMttZ, that is, A^A = I. Show that |A| = ±1. 8.73. Consider a permutation a = Jiij • • • in- Let {e<} be the usual basis of X», and let A be the matrix whose tth row is e^., i.e. A = (e,-^, e^-^, . . ., e^^). Show that |A| = sgn a. 8.74. Let A be an M-square matrix. 
The determinantal rank of A is the order of the largest submatrix of A (obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal rank of A is equal to its rank, i.e. the maximum number of linearly independent rows (or columns). Answers to Supplementary Problems 8.41. (i) -18, (ii) -15. 8.42. (i) t2 - 3t - 10, (ii) t^ - 2t - 8. 8.43. (i) t = 5, t- -2; (ii) t - i, t - -2. 8.44. (i) 21, (ii) -11, (iii) 100, (iv) 0. 196 DETERMINANTS [CHAP. 8 8.45. (i) (< + 2)(t-3)(t-4), (li) (t + 2)2(t-4), (iii) (t + 2)2(f-4). 8.46. (i) 3, 4, -2; (ii) 4, -2; (iii) 4, -2. 8.47. (i) -131, (ii) -55. 8.48. (i) -135, (ii) -103, (iii) -31. 8.49. adj A = -1 1 -1 , A-i = (adj A)/|A1 = i -^ | \ 2 -2 0/ \-l 1 0, 1 -2\ /-I 2\ 8.50. adj A = | -3 -1 6 , A-i =31 2 1 -5/ \-2 -1 5/ (-16 -29 -26 -2\ / 21 -14 -17 -19^ -30 -38 -16 29 1 ,... I -44 H 33 11 -8 51 -13 -1 (") -29 1 13 21 -13 1 28 -18/ \ 17 7 -19 -18; /k 8.52. A = ( „ , 8.54. det(r) = 4. 8.55. (i) 0, (ii) 6, (iii) 1. 8.58. (i) X = 21/26, y = 29/26; (ii) x = -5/13, y = 1/13. 8.59. (i) X = 5, y — 1, z = 1. (ii) Since A = 0, the system cannot be solved by determinants. 8.61. agn a = 1, agn t = —1, sgn v = —1. 8.62. (i) T°v = 53142, (ii) ir°(r = 52413, (iii) <r-i = 32154, (iv) t-i = 14253. 8.66. (i) Yes, (ii) No, (iii) Yes, (iv) No. chapter 9 Eigenvalues and Eigenvectors INTRODUCTION In this chapter we investigate the theory of a single linear operator T on a vector space V of finite dimension. In particular, we find conditions under which T is diago- nalizable. As was seen in Chapter 7, this question is closely related to the theory of similarity transformations for matrices. We shall also associate certain polynomials with an operator T: its characteristic polynomial and its minimum polynomial. These polynomials and their roots play a major role in the investigation of T. 
We comment that the particular field K also plays an important part in the theory since the existence of roots of a polynomial depends on K.

POLYNOMIALS OF MATRICES AND LINEAR OPERATORS
Consider a polynomial f(t) over a field K: f(t) = a_n t^n + ... + a_1 t + a_0. If A is a square matrix over K, then we define
f(A) = a_n A^n + ... + a_1 A + a_0 I
where I is the identity matrix. In particular, we say that A is a root or zero of the polynomial f(t) if f(A) = 0.

Example 9.1: Let A = (1 2 ; 3 4), and let f(t) = 2t^2 - 3t + 7, g(t) = t^2 - 5t - 2. Then
f(A) = 2(1 2 ; 3 4)^2 - 3(1 2 ; 3 4) + 7(1 0 ; 0 1) = (18 14 ; 21 39)
and
g(A) = (1 2 ; 3 4)^2 - 5(1 2 ; 3 4) - 2(1 0 ; 0 1) = (0 0 ; 0 0)
Thus A is a zero of g(t).

The following theorem applies.
Theorem 9.1: Let f and g be polynomials over K, and let A be an n-square matrix over K. Then
(i) (f + g)(A) = f(A) + g(A)
(ii) (fg)(A) = f(A) g(A)
and, for any scalar k ∈ K,
(iii) (kf)(A) = k f(A)
Furthermore, since f(t) g(t) = g(t) f(t) for any polynomials f(t) and g(t),
f(A) g(A) = g(A) f(A)
That is, any two polynomials in the matrix A commute.

198 EIGENVALUES AND EIGENVECTORS [CHAP. 9

Now suppose T : V → V is a linear operator on a vector space V over K. If f(t) = a_n t^n + ... + a_1 t + a_0, then we define f(T) in the same way as we did for matrices:
f(T) = a_n T^n + ... + a_1 T + a_0 I
where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0. We remark that the relations in Theorem 9.1 hold for operators as they do for matrices; hence any two polynomials in T commute. Furthermore, if A is a matrix representation of T, then f(A) is the matrix representation of f(T). In particular, f(T) = 0 if and only if f(A) = 0.

EIGENVALUES AND EIGENVECTORS
Let T : V → V be a linear operator on a vector space V over a field K. A scalar λ ∈ K is called an eigenvalue of T if there exists a nonzero vector v ∈ V for which
T(v) = λv
Every vector satisfying this relation is then called an eigenvector of T belonging to the eigenvalue λ. Note that each scalar multiple kv is such an eigenvector:
T(kv) = k T(v) = k(λv) = λ(kv)
The set of all such vectors is a subspace of V (Problem 9.6) called the eigenspace of λ.
The terms characteristic value and characteristic vector (or: proper value and proper vector) are frequently used instead of eigenvalue and eigenvector.

Example 9.2: Let I : V → V be the identity mapping. Then, for every v ∈ V, I(v) = v = 1v. Hence 1 is an eigenvalue of I, and every vector in V is an eigenvector belonging to 1.

Example 9.3: Let T : R^2 → R^2 be the linear operator which rotates each vector v ∈ R^2 by an angle θ = 90°. Note that no nonzero vector is a multiple of itself. Hence T has no eigenvalues and so no eigenvectors.

Example 9.4: Let D be the differential operator on the vector space V of differentiable functions. We have D(e^{5t}) = 5e^{5t}. Hence 5 is an eigenvalue of D with eigenvector e^{5t}.

If A is an n-square matrix over K, then an eigenvalue of A means an eigenvalue of A viewed as an operator on K^n. That is, λ ∈ K is an eigenvalue of A if, for some nonzero (column) vector v ∈ K^n,
Av = λv
In this case v is an eigenvector of A belonging to λ.

Example 9.5: Find eigenvalues and associated nonzero eigenvectors of the matrix A = (1 2 ; 3 2).
We seek a scalar t and a nonzero vector X = (x ; y) such that AX = tX:
(1 2 ; 3 2)(x ; y) = t(x ; y)
The above matrix equation is equivalent to the homogeneous system
x + 2y = tx, 3x + 2y = ty,  or  (t - 1)x - 2y = 0, -3x + (t - 2)y = 0   (1)
Note that each scalar multiple kv is such an eigenvector: T{kv) = kT{v) = k(\v) = \{kv) The set of all such vectors is a subspace of V (Problem 9.6) called the eigenspace of \. The terms characteristic value and characteristic vector (or: proper value and proper vector) are frequently used instead of eigenvalue and eigenvector. Example 9.2: Let I.V^V be the identity mapping. Then, for every vGV, I{v) = v = Iv. Hence 1 is an eigenvalue of /, and every vector in V is an eigenvector belonging to 1. Example 9.3 : Let 7" : R^ ^ R2 be the linear operator which rotates each vector v GB? by an angle = 90°. Note that no nonzero vector is a multiple of itself. Hence T has no eigen- values and so no eigenvectors. Example 9.4: Let D be the differential operator on the vector space V of diflferentiable functions. We have £)(e*') = 5e^*. Hence 5 is an eigenvalue of D with eigenvector e^'. If A is an w-square matrix over K, then an eigenvalue of A means an eigenvalue of A viewed as an operator on K". That is, \gK is an eigenvalue of A if, for some nonzero (column) vector v G X", Av = \v In this case v is an eigenvector of A belonging to A. such that AX = tX: Example 9.5: Find eigenvalues and associated nonzero eigenvectors of the matrix A = We seek a scalar t and a nonzero vector X = ( \ i DO = •(: The above matrix equation is equivalent to the homogeneous system X + 2y = tx ( {t-l)x- 2y = , or -s [3x + 2y = ty \-Zx + (t-2)y = 1 2 3 2 (i) CHAP. 9] EIGENVALUES AND EIGENVECTORS 199 Recall that the homogeneous system has a nonzero solution if and only if the de- terminant of the matrix of coefficients is 0: t-1 -2 -3 «- 2 t2 - 3t - 4 = {t-4){t+l) = Thus t is an eigenvalue of A if and only if t = 4 or t = —1. Setting t = 4 in (1), 3x ~ 2y = . , or simply 3x — 2y = -Sx + 2y = Thus V = { ) = ( „ ) is a nonzero eigenvector belonging to the eigenvalue t = 4, \V/ \3/ and every other eigenvector belonging to < = 4 is a multiple of v. Setting t = -l in (1), -2x - 2y = . 
, or simply x + y = -3x — 3y = Thus w = I ) — { 1 ) is a nonzero eigenvector belonging to the eigenvalue t = —1, and every other eigenvector belonging to t = —1 is a multiple of w. The next theorem gives an important characterization of eigenvalues which is fre- quently used as its definition. Theorem 9.2: Let T:V -^V be a linear operator on a vector space over K. Then XGK is an eigenvalue of T if and only if the operator Xl — T is singular. The eigenspace of A is then the kernel of XI — T. Proof. X is an eigenvalue of T if and only if there exists a nonzero vector v such that T{v) = XV or {Xl){v)-T{v) = or {Xl-T){v) = i.e. Xl — T is singular. We also have that v is in the eigenspace of X if and only if the above relations hold; hence v is in the kernel of XI — T. We now sta+2 a very useful theorem which we prove (Problem 9.14) by induction: Theorem 9.3: Nonzero eigenvectors belonging to distinct eigenvalues are linearly independent. Example 9.6: Consider the functions e°i', 6°^', . . ., e''n' where aj, . . .,a„ are distinct real numbers. If D is the differential operator then D(e°i'') = a^e'^''*. Accordingly, e^i', ..., e°»' are eigenvectors of D belonging to the distinct eigenvalues ai, . . . , a„, and so, by Theorem 9.3, are linearly independent. We remark that independent eigenvectors can belong to the same eigenvalue (see Problem 9.7). DIAGONALIZATION AND EIGENVECTORS Let T:V -^ V be a linear operator on a vector space V with finite dimension n. Note that T can be represented by a diagonal matrix 'A;i ... k2 ... ,0 ... kni 200 EIGENVALUES AND EIGENVECTORS [CHAP. 9 if and only if there exists a basis [vi, . . .,v„} of V for which T{vi) = kivi T{v2) — kiVi T{V„) - knVn that is, such that the vectors vi, ■ ■ .,Vn are eigenvectors of T belonging respectively to eigen- values ki, . . ., fen. In other words: Theorem 9.4: A linear operator T : V -» V can be represented by a diagonal matrix B if and only if V has a basis consisting of eigenvectors of T. 
In this case the diagonal elements of B are the corresponding eigenvalues. We have the following equivalent statement. Alternate Form of Theorem 9.4: An w-square matrix A is similar to a diagonal matrix B if and only if A has n linearly independent eigen- vectors. In this case the diagonal elements of B are the corresponding eigenvalues. In the above theorem, if we let P be the matrix whose columns are the n independent eigenvectors of A, then B = P~^AP. Example 9.7: Consider the matrix A = ( j . By Example 9.5, A has two independent /2\ / 1\ /2 1\ „_, /1/5 1/5 eigenvectors (^^j and (^_J. Set P = {^^ _^j , and so P i - (^^^^ _^^^ Then A is similar to the diagonal matrix B = P-^AP = /1/5 l/5\/l 2\/2 1\ ^ /4 0\ 1^3/5 -2/5/^3 2/1^3 -1/ ~ \0 -l) As expected, the diagonal elements 4 and —1 of the diagonal matrix B are the eigen- values corresponding to the given eigenvectors. CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM Consider an %-square matrix A over a field K: (0,11 ai2 . . . ain Chl ffl22 . . . fflZn ttnl ttre2 . • ■ flnn The matrix tin — A, where /„ is the n-square identity matrix and * is an indeterminant, is called the characteristic matrix of A: (t — an — ffli2 . . . —ain ~tt21 t — a22 . . . —a2n — ftnl — ttn2 ... t aim Its determinant AA{t) — det {tin — A) which is a polynomial in t, is called the characteristic polynomial of A. We also call AA{t) = det (tin -A) = the characteristic equation of A. CHAP. 9] EIGENVALUES AND EIGENVECTORS 201 Now each term in the determinant contains one and only one entry from each row and from each column; hence the above characteristic polynomial is of the form AA{t) - {t- an){t - 022) ••■(*- ann) + terms with at most n — 2 factors of the form t — an Accordingly, AaC*) = t" — (au + a22+ • • • +aTO)t"~^ + terms of lower degree Recall that the trace of A is the sum of its diagonal elements. 
Thus the characteristic polynomial Aa(*) = det (i/„ — A) of A is a monic polynomial of degree n, and the coefficient of i"~^ is the negative of the trace of A. (A polynomial is monic if its leading coefficient is 1.) Furthermore, if we set i = in Aa(<), we obtain Aa(0) = \-A\ = (-1)"[A| But Aa(0) is the constant term of the polynomial AaC*). Thus the constant term of the char- acteristic polynomial of the matrix A is (—1)" \A\ where n is the order of A. 13 0^ Example 9.8: The characteristic polynomial of the matrix A = ( — 2 2—1 4 -21 ^(t) = \tI-A\ = * - 1 -3 2 t-2 1 -4 t + 2 = «3 - <2 + 2« + 28 As expected, A(t) is a monic polynomial of degree 3. We now state one of the most important theorems in linear algebra. Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic polynomial. /I 2^ Example 9.9: The characteristic polynomial of the matrix A = I V o Ji A(t) = \tI-A\ = t- 1 -2 -3 t-2 = t2 - 3t As expected from the Cayley-Hamilton theorem, A is a zero of A{t): '<^'^G iJ-'Q i)-'Q °) = (: :) The next theorem shows the intimate relationship between characteristic polynomials and eigenvalues. Theorem 9.6: Let A be an n-square matrix over a field K. A scalar \GK is an eigen- value of A if and only if A is a root of the characteristic polynomial A(t) of A. Proof. By Theorem 9.2, A is an eigenvalue of A if and only if XI — A is singular. Furthermore, by Theorem 8.4, XI — A is singular if and only if |a7 — A| = 0, i.e. A is a root of A(t). Thus the theorem is proved. Using Theorems 9.3, 9.4 and 9.6, we obtain Corollary 9.7: If the characteristic polynomial A(t) of an ^-square matrix A is a product of distinct linear factors: 202 EIGENVALUES AND EIGENVECTORS [CHAP. 9 A{t) = {t- ai){t - aa) • • • (t - an) i.e. if ai, . . . , On are distinct roots of A{t), then A is similar to a diagonal matrix whose diagonal elements are the ok. 
Furthermore, using the Fundamental Theorem of Algebra (every polynomial over C has a root) and the above theorem, we obtain Corollary 9.8: Let A be an w-square matrix over the complex field C. Then A has at least one eigenvalue. Example 9.10: Let A = 2 — 5 . Its characteristic polynomial is A(t) = We consider two cases: t - 3 «- 2 5 -1 t + 2 = («-3)(t2 + l) (i) A is a matrix over the real field R. Then A has only the one eigenvalue 3. Since 3 has only one independent eigenvector, A is not diagonalizable. (ii) A is a matrix over the complex field C. Then A has three distinct eigenvalues: 3, i and —i. Thus there exists an invertible matrix P over the complex field C for which /3 0^ P-iAP = i \0 -i, i.e. A is diagonalizable. Now suppose A and B are similar matrices, say B = P^^AP where P is invertible. We show that A and B have the same characteristic polynomial. Using tl — P~^tIP, \tI-B\ = \tI-P-'AP\ = \P-HIP - P-'AP\ = \P-mi-A)P\ =■ \P-^\\tI-A\\P\ Since determinants are scalars and commute, and since |P~i| |P| = 1, we finally obtain \tI-B\ = \tI-A\ Thus we have proved Theorem 9.9: Similar matrices have the same characteristic polynomial. MINIMUM POLYNOMIAL Let A be an w-square matrix over a field K. Observe that there are nonzero polynomials f{t) for which f{A) — 0; for example, the characteristic polynomial of A. Among these polynomials we consider those of lowest degree and from them we select one whose leading coefficient is 1, i.e. which is monic. Such a polynomial m{t) exists and is unique (Problem 9.25); we call it the minimum polynomial of A. Theorem 9.10: The minimum polynomial m{f) of A divides every polynomial which has A as a zero. In particular, m{t) divides the characteristic polynomial A(t) of A. There is an even stronger relationship between m{t) and A(i). CHAP. 9] EIGENVALUES AND EIGENVECTORS 203 2 1 2 2 5 Theorem 9.11: The characteristic and minimum polynomials of a matrix A have the same irreducible factors. 
This theorem does not say that w(i) = A(t); only that any irreducible factor of one must divide the other. In particular, since a linear factor is irreducible, m(t) and A(t) have the same linear factors; hence they have the same roots. Thus from Theorem 9.6 we obtain Theorem 9.12: A scalar A is an eigenvalue for a matrix A if and only if A is a root of the minimum poljTiomial of A. Example 9.11: Find the minimum polynomial ■>«(*) of the matrix A The characteristic polynomial of A is A(t) = |t/- A] = (t- 2)»(t- 5). By Theorem 9.11, both t — 2 and t — 5 must be factors of m(t). But by Theorem 9.10, vnif) must divide A(t); hence m(i) must be one of the following three polynomials: TOi(t) = (t-2)(t-5), W2(t) = (t-2)2(t-5), m^fy = (t-2)3(t-5) We know from the Cayley-Hamilton theorem that in^iA) — A(A) = 0. The reader can verify that »ni(A) # but WgCA) = 0. Accordingly, m^it) = {t — 2)^ (t — 5) is the minimum polynomial of A. Example 9.12: Let A be a 3 by 3 matrix over the real field R. We show that A cannot be a zero of the polynomial f{t) = t^ + 1. By the Cayley-Hamilton theorem, A is a zero of its characteristic poljmomial A(t). Note that A(t) is of degree 3; hence it has at least one real root. Now suppose A is a zero of f{t). Since f(t) is irreducible over R, f{t) must be the minimal polynomial of A. But /(t) has no real root. This contradicts the fact that the characteristic and minimal polynomials have the same roots. Thus A is not a zero of f{t). The reader can verify that the following 3 by 3 matrix over the complex field C is a zero of f(t): fo -1 0^ 10 lO CHARACTERISTIC AND MINIMUM POLYNOMIALS OF LINEAR OPERATORS Now suppose T:V^V is a linear operator on a vector space V with finite dimension. We define the characteristic polynomial A(<) of T to be the characteristic polynomial of any matrix representation of T. By Theorem 9.9, A{t) is independent of the particular basis in which the matrix representation is computed. 
Note that the degree of Δ(t) is equal to the dimension of V. We have theorems for T which are similar to the ones we had for matrices:

Theorem 9.5': T is a zero of its characteristic polynomial.

Theorem 9.6': The scalar λ ∈ K is an eigenvalue of T if and only if λ is a root of the characteristic polynomial of T.

The algebraic multiplicity of an eigenvalue λ ∈ K of T is defined to be the multiplicity of λ as a root of the characteristic polynomial of T. The geometric multiplicity of the eigenvalue λ is defined to be the dimension of its eigenspace.

Theorem 9.13: The geometric multiplicity of an eigenvalue λ does not exceed its algebraic multiplicity.

204 EIGENVALUES AND EIGENVECTORS [CHAP. 9

Example 9.13: Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let D be the differential operator on V. Then

    D(sin θ) =  cos θ =  0(sin θ) + 1(cos θ)
    D(cos θ) = -sin θ = -1(sin θ) + 0(cos θ)

The matrix A of D in the above basis is therefore

    A = [D] = | 0  -1 |
              | 1   0 |

Thus

    det(tI - A) = |  t   1 |
                  | -1   t |   =  t^2 + 1

and the characteristic polynomial of D is Δ(t) = t^2 + 1.

On the other hand, the minimum polynomial m(t) of the operator T is defined independently of the theory of matrices, as the polynomial of lowest degree and leading coefficient 1 which has T as a zero. However, for any polynomial f(t), f(T) = 0 if and only if f(A) = 0 where A is any matrix representation of T. Accordingly, T and A have the same minimum polynomial. We remark that all the theorems in this chapter on the minimum polynomial of a matrix also hold for the minimum polynomial of the operator T.

Solved Problems

POLYNOMIALS OF MATRICES AND LINEAR OPERATORS

9.1. Find f(A) where A = T ~z\ and f(t) ^ f-M + 1.

9.2. Show that A = | 1  4 |
                   | 2  3 |   is a zero of f(t) = t^2 - 4t - 5.

    f(A) = A^2 - 4A - 5I = | 9  16 |  -  | 4  16 |  -  | 5  0 |  =  | 0  0 |
                           | 8  17 |     | 8  12 |     | 0  5 |     | 0  0 |

9.3. Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let D be the differential operator on V. Show that D is a zero of f(t) = t^2 + 1.
Apply f(D) to each basis vector: /(D)(sin9) = (Z>2 + 7)(sin e) = r»2(sin e) + /(sine) = -sin « + sin 9 = /(D)(cos e) = {D^ + /)(cos e) = DHcos e) + /(cos e) = -cos 9 + cos e = Since each basis vector is mapped into 0, every vector i; S V is also mapped into by f(D). Thus fm = 0. This result is expected since, by Example 9.13, /(<) is the characteristic polynomial of D. CHAP. 9] EIGENVALUES AND EIGENVECTORS 205 9.4. Let A be a matrix representation of an operator T. Show that /(A) is the matrix representation of f{T), for any polynomial f{t). Let <t> be the mapping T h* A, i.e. which sends the operator T into its matrix representation A. We need to prove that <i>(f(T)) - f(A). Suppose fit) = a„t" -\- • ■ ■ + a^t + a^. The proof is by in- duction on n, the degree of fit). Suppose TO = 0. Recall that 0(/') = / where /' is the identity mapping and / is the identity matrix. Thus <t>{f(T)) = <t>(Hr) = ao0(/') = V = /(A) and so the theorem holds for n = 0. Now assume the theorem holds for polynomials of degree less than n. Then since ^ is an algebra isomorphism, ^if(T)) = 0(tt„r" + a„_irn-i + • • • + aiT + a^') = a„0(r) 0(r»-i) + 0(a„_ir"-i + • • • + aiT + a^') = o„AA«-i + (a„_jA"-i + • • • + aiA + a<,/) = /(A) and the theorem is proved. 9.5. Prove Theorem 9.1: Let / and g be polynomials over K. Let A be a square matrix over K. Then: (i) (/ + fir)(A) = /(A) + flr(A); (ii) {fg){A) = /(A) g{A); and (iii) (fe/)(A) = kf(A) where fc G K. Suppose / = a„t" + • • • + Oi* + Oq and g = b^t^ + • • • + bit + bf,. Then by definition, f(A) = a„A» + • • • + ttiA + Oq/ and ^(A) = ft^A" + • ■ • + biA + bol (i) Suppose m — n and let ftj = if i> m. Then / + sr = (a„+6„)t"+ ••• +K + 6i)t + (ao+6o) Hence (/ + g)(A) = {a„ + 6„)A" +•••+(«! + 6i)A + (a<, + 6o)^ = a„A» + 6„A« + • • • + ajA + 6iA + tto^ + 60/ = /(A) + flr(A) n + m (ii) By definition, fg = c„ +„ t" +>"+•••+ Ci* + Co = 2 c^t'' where c^ = eio^fc + "'i^fc-i + fc = fc n + m + Ojcfro = 2 aj6fc_i. 
Hence (/ff)(A) = 2 CfcA^ and 1=0 /n N/™ \ nm n + m /(A)f,(A) = ( 2 OiAMf 2 6jAi) = 2 2»M*+^ = 2 CfcA" = (/ff)(A) \i=o /\j=o / t=0 j=0 fc=0 (iii) By definition, kf = A;a„t" + • • • + fcajt + fcoo. and so (fe/)(A) = fca„A» + • • • + fettiA + fcoo/ = A;(a„A» + • • • + ajA + o,/) = k /(A) EIGENVALUES AND EIGENVECTORS 9.6. Let A, be an eigenvalue of an operator T:V^V. Let Vx denote the set of all eigen- vectors of T belonging to the eigenvalue X (called the eigenspace of A.). Show that Vx is a subspace of V. Suppose v,w & Vx, that is, T(v) = \v and r(w) = \w. Then for any scalars a,b & K, T(av + 6w) = a r(i;) + 6 T(w) - aiw) + b{\w) - Mav + bw) Thus av + bw is an eigenvector belonging to X, i.e. av + bw G V^. Hence Vx is a subspace of V. 206 EIGENVALUES AND EIGENVECTORS [CHAP. 9 9.7. Let A = 1 4\ 2 g I . (i) Find all eigenvalues of A and the corresponding eigenvectors. (ii) Find an invertible matrix P such that P-^AP is diagonal. U) {t-5){t+l) The roots of A(t) are 5 and —1, and so these numbers are the eigenvalues of A. We obtain the eigenvectors belonging to the eigenvalue 5. First substitute t = 5 into 4 -4- 5 form the solution of the homogeneous system determined by the above matrix, i.e., 4 -4\/x\ _ /0\ f 4x-4y = .-2 2)\y) ~ \o) ""^ \^2x + 2y = (i) Form the characteristic matrix */ -AoiA: \0 tj \2 3J \ -2 t-3 The characteristic polynomial A(t) of A is its determinant: A(t) = \tI-A\ = «- 1 -4 -2 «- 3 = «2 _ 4t _ 5 = the characteristic matrix (1) to obtain the matrix The eigenvectors belonging to X — 2/ = (In other words, the eigenvectors belonging to 5 form the kernel of the operator tl — A for t = 5.) The above system has only one independent solution; for example, x = 1, y = 1. Thus ■" = (1, 1) is an eigenvector which generates the eigenspace of 5, i.e. every eigenvector belong- ing to 5 is a multiple of v. We obtain the eigenvectors belonging to the eigenvalue —1. 
Substitute t = — 1 into {1) to obtain the homogeneous system -2 -4 -2 -4 ~2x - 4.y = -2x - 4i/ = The system has only one independent solution; for example, x is an eigenvector which generates the eigenspace of —1. or X + 2j/ = 2, 3/ = -1. Thus w = (2, -1) (ii) Let P be the matrix whose columns are the above eigenvectors: P B = P-^AP = ■1 2 1 -1 B — P~^AP is the diagonal matrix whose diagonal entries are the respective eigenvalues: '1/3 2/3\/l 4\/l 2\ _ /5 1/3 -1/3 JV 2 sjil -1/ " U -1 Then {Remark. Here P is the transition matrix from the usual basis of R2 to the basis of eigen- vectors {v, w}. Hence B is the matrix representation of the operator A in this new basis.) 9.8. For each matrix, find all eigenvalues and a basis of each eigenspace: 1 -3 3\ /-3 1 -l' (i) A = I 3 -5 3 , (ii) 5 = -7 5-1 6-6 4 Which matrix can be diagonalized, and why? -6 6 -2 (1) Form the characteristic matrix tl — A and compute its determinant to obtain the character- istic polynomial A(t) of A: « - 1 3 -3 -3 t + 5 -3 = (t + 2)2(t-4) -6 6 f - 4 A(t) = \tI-A\ = The roots of A(t) are —2 and 4; hence these numbers are the eigenvalues of A. CHAP. 9] EIGENVALUES AND EIGENVECTORS 207 We find a basis of the eigenspace of the eigenvalue —2. Substitute t = — 2 into the char- acteristic matrix tl — A to obtain the homogeneous system f-Sa; + 32/ - 32 = or i —3a; + 3j/ — 3z = or x — 2/ + z = [—6a; + 62/ — 6z == The system has two independent solutions, e.g. a; = l, 2/ = l, z = and a; = 1, j/ = 0, z = —1. Thus u = (1, 1, 0) and v = (1, 0, —1) are independent eigenvectors which generate the eigen- space of —2. That is, u and v form a basis of the eigenspace of —2. This means that every eigenvector belonging to —2 is a linear combination of u and v. We find a basis of the eigenspace of the eigenvalue 4. 
Substitute t = 4 into the char- acteristic matrix tl — A to obtain the homogeneous system 3x + By - - 3z = Sx + 9y - - 3z = 6x + 6y = X + y — z = 2y - z = The system has only one free variable; hence any particular nonzero solution, e.g. x = 1, y = 1, z = 2, generates its solution space. Thus w = (1, 1, 2) is an eigenvector which generates, and so forms a basis, of the eigenspace of 4. Since A has three linearly independent eigenvectors, A is diagonalizable. In fact, let P be the matrix whose columns are the three independent eigenvectors: Then P-^AP = /-2 \ As expected, the diagonal elements of P~^AP are the eigenvalues of A corresponding to the columns of P. (ii) A(t) = \tI-B\ t + 3 -1 7 t-5 6 -6 1 1 t + 2 = (t + 2)2(t-4) The eigenvalues of B are therefore —2 and 4. We find a basis of the eigenspace of the eigenvalue —2. to obtain the homogeneous system Substitute t = -2 into tl - B X - - y + z = Ix- -ly + z = 6a; - -6y — y + z = y =0 The system has only one independent solution, e.g. a; = 1, 2/ = 1, z = 0. Thus u — (1, 1, 0) forms a basis of the eigenspace of —2. We find a basis of the eigenspace of the eigenvalue 4. Substitute t = 4 into tl —B to obtain the homogeneous system 7a; — 2/ + z = a; = 7a; - 2/ + 2 = 7a; - 2/ + z = 6a;- 62/ + 6Z = The system has only one independent solution, e.g. a; = 0, 2/ = 1, z = 1. Thus v forms a basis of the eigenspace of 4. (0.1,1) Observe that B is not similar to a diagonal matrix since B has only two independent eigenvectors. Furthermore, since A can be diagonalized but B cannot, A and B are not similar matrices, even though they have the same characteristic polynomial. 208 EIGENVALUES AND EIGENVECTORS [CHAP. 9 9.9. Let A = (i) Mt) = \tI-A\ = Find all eigenvalues and the corresponding eigenvectors of A and B viewed as matrices over (i) the real field R, (ii) the complex field C. t- 3 1 -1 t-1 Hence only 2 is an eigenvalue. 
Put t = 2 into tI - A and obtain the homogeneous system

    | -1  1 | |x|   |0|        -x + y = 0
    | -1  1 | |y| = |0|   or    x - y = 0

The system has only one independent solution, e.g. x = 1, y = 1. Thus v = (1, 1) is an eigenvector which generates the eigenspace of 2, i.e. every eigenvector belonging to 2 is a multiple of v. We also have

    Δ_B(t) = |tI - B| = | t-1    1  |
                        | -2    t+1 |   =  t^2 + 1

Since t^2 + 1 has no root in R, B has no eigenvalue as a matrix over R.

(ii) Since Δ_A(t) = (t - 2)^2 has only the real root 2, the results are the same as in (i). That is, 2 is an eigenvalue of A, and v = (1, 1) is an eigenvector which generates the eigenspace of 2, i.e. every eigenvector of 2 is a (complex) multiple of v.

The characteristic polynomial of B is Δ_B(t) = |tI - B| = t^2 + 1. Hence i and -i are the eigenvalues of B. We find the eigenvectors associated with t = i. Substitute t = i into tI - B to obtain the homogeneous system

    | i-1    1  | |x|   |0|        (i-1)x + y = 0
    | -2    i+1 | |y| = |0|   or   -2x + (i+1)y = 0

The system has only one independent solution, e.g. x = 1, y = 1 - i. Thus w = (1, 1-i) is an eigenvector which generates the eigenspace of i. Now substitute t = -i into tI - B to obtain the homogeneous system

    (-i-1)x + y = 0
    -2x + (-i+1)y = 0

The system has only one independent solution, e.g. x = 1, y = 1 + i. Thus w' = (1, 1+i) is an eigenvector which generates the eigenspace of -i.

9.10. Find all eigenvalues and a basis of each eigenspace of the operator T : R^3 → R^3 defined by T(x, y, z) = (2x + y, y - z, 2y + 4z).

First find a matrix representation of T, say relative to the usual basis of R^3:

    A = [T] = | 2  1   0 |
              | 0  1  -1 |
              | 0  2   4 |

The characteristic polynomial Δ(t) of T is then

    Δ(t) = |tI - A| = | t-2  -1    0  |
                      |  0   t-1   1  |
                      |  0   -2   t-4 |   =  (t - 2)^2 (t - 3)

Thus 2 and 3 are the eigenvalues of T. We find a basis of the eigenspace of the eigenvalue 2. Substitute t = 2 into tI - A to obtain the homogeneous system

    | 0  -1   0 | |x|   |0|
    | 0   1   1 | |y| = |0|   or   { y = 0
    | 0  -2  -2 | |z|   |0|        { y + z = 0

CHAP.
9] EIGENVALUES AND EIGENVECTORS 209 The system has only one Independent solution, e.g. x = 1, y = 0, z = 0. Thus u = (1, 0, 0) forms a basis of the eigenspace of 2. We find a basis of the eigenspace of the eigenvalue 3. Substitute t — 3 into tl — A to obtain the homogeneous system The system has only one independent solution, e.g. x — 1, y — 1, z = —2. Thus v = (1, 1, —2) forms a basis of the eigenspace of 3. Observe that T is not diagonalizable, since T has only two linearly independent eigenvectors. 9.11. Show that is an eigenvalue of T if and only if T is singular. We have that is an eigenvalue of T if and only if there exists a nonzero vector v such that T{v) = Ov = 0, i.e. that T is singular. 9.12. Let A and B be w-square matrices. Show that AB and BA have the same eigenvalues. By Problem 9.11 and the fact that the product of nonsingular matrices is nonsingular, the fol- lowing statements are equivalent: (i) is an eigenvalue of AB, (ii) AB is singular, (iii) A or B is singular, (iv) BA is singular, (v) is an eigenvalue of BA. Now suppose X is a nonzero eigenvalue of AB. Then there exists a nonzero vector v such that ABv = Xv. Set w = Bv. Since \ # and v ¥= 0, Aw = ABv = \v ¥= and so w # But w is an eigenvector of BA belonging to the eigenvalue X since BAw - BABv = B\v = \Bv = \w Hence X is an eigenvalue of BA. Similarly, any nonzero eigenvalue of BA is also an eigenvalue of AB. Thus AB and BA have the same eigenvalues. 9.13. Suppose A. is an eigenvalue of an invertible operator T. Show that A~* is an eigenvalue of T-K Since T is invertible, it is also nonsingular; hence by Problem 9.11, X # 0. By definition of an eigenvalue, there exists a nonzero vector i; for which T(v) — Xv. Apply- ing r-i to both sides, we obtain v = T-i(\v) = xr-i(i;). Hence r-i(v) = X-iv; that is, X"! is an eigenvalue of r~i. 9.14. Prove Theorem 9.3: Let Vi, . . .,Vn be nonzero eigenvectors of an operator T:V ->V belonging to distinct eigenvalues Ai, . . . , A„. Then vi, . . 
.., vn are linearly independent.

The proof is by induction on n. If n = 1, then v1 is linearly independent since v1 ≠ 0. Assume n > 1. Suppose

    a1 v1 + a2 v2 + ... + an vn = 0     (1)

where the ai are scalars. Applying T to the above relation, we obtain by linearity

    a1 T(v1) + a2 T(v2) + ... + an T(vn) = T(0) = 0

But by hypothesis T(vi) = λi vi; hence

    a1 λ1 v1 + a2 λ2 v2 + ... + an λn vn = 0     (2)

210 EIGENVALUES AND EIGENVECTORS [CHAP. 9

On the other hand, multiplying (1) by λn,

    a1 λn v1 + a2 λn v2 + ... + an λn vn = 0     (3)

Now subtracting (3) from (2),

    a1(λ1 - λn) v1 + a2(λ2 - λn) v2 + ... + a_{n-1}(λ_{n-1} - λn) v_{n-1} = 0

By induction, each of the above coefficients is 0. Since the λi are distinct, λi - λn ≠ 0 for i ≠ n. Hence a1 = ... = a_{n-1} = 0. Substituting this into (1) we get an vn = 0, and hence an = 0. Thus the vi are linearly independent.

CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM

9.15. Consider a triangular matrix

    A = | a11  a12  ...  a1n |
        |  0   a22  ...  a2n |
        |  ..................|
        |  0    0   ...  ann |

Find its characteristic polynomial Δ(t) and its eigenvalues.

Since A is triangular and tI is diagonal, tI - A is also triangular with diagonal elements t - aii:

    tI - A = | t-a11  -a12   ...   -a1n |
             |   0    t-a22  ...   -a2n |
             |   ......................|
             |   0      0    ...  t-ann |

Then Δ(t) = |tI - A| is the product of the diagonal elements t - aii:

    Δ(t) = (t - a11)(t - a22) ... (t - ann)

Hence the eigenvalues of A are a11, a22, ..., ann, i.e. its diagonal elements.

9.16. Let A = | 1  2  3 |
              | 0  2  3 |
              | 0  0  3 |

Is A similar to a diagonal matrix? If so, find one such matrix.

Since A is triangular, the eigenvalues of A are the diagonal elements 1, 2 and 3. Since they are distinct, A is similar to a diagonal matrix whose diagonal elements are 1, 2 and 3; for example,

    | 1  0  0 |
    | 0  2  0 |
    | 0  0  3 |

9.17. For each matrix, find a polynomial having the matrix as a root:

    (i) A = | 2   5 |     (ii) B = | 2  -3 |     (iii) C = | 1  4  -3 |
            | 1  -3 |              | 7  -4 |               | 0  3   1 |
                                                           | 0  2  -1 |

By the Cayley-Hamilton theorem every matrix is a root of its characteristic polynomial. Therefore we find the characteristic polynomial Δ(t) in each case.
(i) Δ(t) = |tI - A| = | t-2   -5  |
                      | -1    t+3 |   =  t^2 + t - 11

(ii) Δ(t) = |tI - B| = | t-2    3  |
                       | -7    t+4 |   =  t^2 + 2t + 13

(iii) Δ(t) = |tI - C| = | t-1  -4    3  |
                        |  0   t-3  -1  |
                        |  0   -2   t+1 |   =  (t - 1)(t^2 - 2t - 5)

CHAP. 9] EIGENVALUES AND EIGENVECTORS 211

9.18. Prove the Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic polynomial.

Let A be an arbitrary n-square matrix and let Δ(t) be its characteristic polynomial; say,

    Δ(t) = |tI - A| = t^n + a_{n-1} t^{n-1} + ... + a1 t + a0

Now let B(t) denote the classical adjoint of the matrix tI - A. The elements of B(t) are cofactors of the matrix tI - A and hence are polynomials in t of degree not exceeding n - 1. Thus

    B(t) = B_{n-1} t^{n-1} + ... + B1 t + B0

where the Bi are n-square matrices over K which are independent of t. By the fundamental property of the classical adjoint (Theorem 8.8), (tI - A) B(t) = |tI - A| I, or

    (tI - A)(B_{n-1} t^{n-1} + ... + B1 t + B0) = (t^n + a_{n-1} t^{n-1} + ... + a1 t + a0) I

Removing parentheses and equating the coefficients of corresponding powers of t,

    B_{n-1} = I
    B_{n-2} - A B_{n-1} = a_{n-1} I
    ...
    B0 - A B1 = a1 I
    -A B0 = a0 I

Multiplying the above matrix equations by A^n, A^{n-1}, ..., A, I respectively,

    A^n B_{n-1} = A^n
    A^{n-1} B_{n-2} - A^n B_{n-1} = a_{n-1} A^{n-1}
    ...
    A B0 - A^2 B1 = a1 A
    -A B0 = a0 I

Adding the above matrix equations,

    0 = A^n + a_{n-1} A^{n-1} + ... + a1 A + a0 I

In other words, Δ(A) = 0. That is, A is a zero of its characteristic polynomial.

9.19. Show that a matrix A and its transpose A^t have the same characteristic polynomial.

By the transpose operation, (tI - A)^t = t I^t - A^t = tI - A^t. Since a matrix and its transpose have the same determinant, |tI - A| = |(tI - A)^t| = |tI - A^t|. Hence A and A^t have the same characteristic polynomial.

9.20. Suppose M = | A1   B  |
                  |  0   A2 |

where A1 and A2 are square matrices. Show that the characteristic polynomial of M is the product of the characteristic polynomials of A1 and A2. Generalize.

    tI - M = | tI - A1     -B    |
             |    0     tI - A2  |

Hence, by Problem 8.70, |tI - M| = |tI - A1| |tI - A2|, as required.

By induction, the characteristic polynomial of the triangular block matrix
By induction, the characteristic poljmomial of the triangular block matrix tl - Ai -B tl - A^ 212 EIGENVALUES AND EIGENVECTORS [CHAP. 9 M = lA, B A2 C D \ ... A^l where the Aj are square matrices, is the product of the characteristic polynomials of the Aj. MINIMUM POLYNOMIAL 9.21. Find the minimum polynomial m{t) of A = The characteristic polynomial of A is 2 1 2 1 1 -2 4 A(f) = t-2 -1 t-2 t- 1 t-2. -1 t-2 t-1 -1 2 i-4 = (t-3)(t-2)3 The minimum polynomial m{t) must divide A(i). Also, each irreducible factor of A(t), i.e. t — 2 and < — 3, must be a factor of m{t). Thus m(t) is exactly one of the following: /(t) = (t-3)(t-2), flr(t) = (i-3)(«-2)2, /i(«) = (t-3)(t-2)3 We have /(A) = (A-37)(A-2/) = ff(A) = (A - 37)(A - 27)2 = -1 1 o\ 1' 1 -1 '0 -2 1 \: -1 1 -2 1/ -2 2 1 1 o\ lo 1 -1 -2 1 -1 1 ¥- = 0-2 ,0 Thus g{t) = (t — 3)(t — 2)2 is the minimum polynomial of A. Remark. We know that h{A) — A(A) =; by the Cayley-Hamilton theorem. However, the degree of g{t) is less than the degn:ee of h(t); hence g(f), and not h(t), is the minimum poljmomial of A. 9.22. Find the minimal polynomial m{t) of each matrix (where a¥^Oy. ^ \o */ \o A, (i) The characteristic polynomial of A is A(t) - {t — X)2. We find A — \/ # 0; hence m{t) — A(t) = (t-\)K (ii) The characteristic polynomial of B is A(i) = (t — X)^. (Note m(t) is exactly one of t — \, {t — X)2 or {t - X)3.) We find (B - X/)2 ¥- 0; thus OT(t) = A(t) = (t - X)S. (iii) The characteristic polynomial of C is A(t) = (t - X)*. We find (C - X/)3 # 0; hence m{t) = A(t) = (t-X)'«. CHAP. 9] EIGENVALUES AND EIGENVECTORS 213 9.23. (A 0\ Let M = I Q g 1 where A and B are square matrices. Show that the minimum polynomial m(f) of M is the least common multiple of the minimum polynomials g(f) and fe(t) of A and B respectively. Generalize. "m,(A.) m(B)^ Since m(V) is the minimum polynomial of M, m(Af) = = and hence ■m(A) = and mifi) = 0. Since g{t) is the minimum polynomial of A, g{t) divides m{t). Similarly, h{t) divides m{t). 
Thus m(t) is a multiple of g(t) and h(t). /f(A) \ /O 0^ Now let f{t) be another multiple of g{t) and h(t); then /(M) = = 0. V /(B), But m{t) is the minimum polynomial of M; hence m(t) divides /(t). Thus m{t) is the least common multiple of g{t) and /i(t). We then have, by induction, that the minimum polynomial of M = \o where the Aj are square matrices, is the least common multiple of the minimum polynomials of the A,. 9.24. Find the minimum polynomial m(i) of '2 M Let A 2 8 2 c D = (5). The minimum polynomials of A, C and D are (t — 2)^, t2 and t — 5 respectively. The characteristic polynomial of B is \tI-B\ = t- 4: -2 -1 t-3 t2 7t + 10 = (t-2)(t-5) and so it is also the minimum polynomial of B. '\ Observe that M = 'A B C ,0 2?/ polynomials of A, B, C and D. Accordingly, m(t) = tm - 2)2(t — 5) Thus m(t) is the least common multiple of the minimum 9.25. Show that the minimum polynomial of a matrix (operator) A exists and is unique. By the Cayley-Hamilton theorem, A is a zero of some nonzero polynomial (see also Problem 9.31). Let n be the lowest degree for which a polynomial f(t) exists such that /(A) = 0. Dividing f(t) by its leading coefficient, we obtain a monic polynomial m(t) of degree n which has A as a zero. Sup- pose m'(t) is another monic polynomial of degree n for which m'{A) = 0. Then the difference m{t) — m'(t) is a nonzero polynomial of degree less than n which has A as a zero. This contradicts the original assumption on n; hence m(t) is a unique minimum polynomial. 214 EIGENVALUES AND EIGENVECTORS [CHAP. 9 9.26. Prove Theorem 9.10: The minimum polynomial m(t) of a matrix (operator) A divides every polynomial which has A as a zero. In particular, m{t) divides the char- acteristic polynomial of A. Suppose f(t) is a polynomial for which f{A) = 0. By the division algorithm there exist poly- nomials q{t) and r{t) for which f{t) = m{t) q(t) + r(t) and r(t) = or deg r(t) < deg m(t). 
Sub- stituting t = A in this equation, and using that f(A) = and 'm{A) = 0, we obtain r(A) = 0. If r{t) ¥= 0, then r(t) is a polynomial of degree less than m(t) which has A as a zero; this contradicts the definition of the minimum polynomial. Thus r(t) = and so f{t) = m{t) q(t), i.e. m(t) divides /(*)• 9.27. Let m{t) be the minimum polynomial of an %-square matrix A. Show that the char- acteristic polynomial of A divides (m(i))''. Suppose m(t) = f + Cjf-i -|- • • ■ -I- c^-it + c^. Consider the following matrices: Bo = / Bi = A + cj Bo = A^ + CjA + Cg/ B^_i = A"--! + CiA'-2 4- • ■ • + c^_i7 Then B^ = I Bi - ABg = cj B^-AB^ = C2I Also, -AB^^i = C^ - {Ar+CiAr-i+ ■■■ +Cr-iA+CrI) = c^ — «i(A) = c^I Set B{t) = f-i^o + f-^Bi + • • ■ + tBr-2 + Br-1 Then {tI-A)-B(t) = (t'-Bo+t'-'^Bi+ •■• +tBr-i) - (t'-^ABo + tr-^ABi+ ••• +ABr-i) = t^Bo+ tr-i{Bi-ABo)+ t'-2(B2-ABi)+ •■■ -f t(B,._i - AB^-a) - AB^-i = f/ -1- Cif-l/ + C^f-^I + ••• + Cr-itl + C^ = m(t)I The determinant of both sides gives \tl — A\ \B{t)\ = \m(t) I\ = (TO(t))". Since \B(t)\ is a polynomial, 1*7 — A I divides (m(t))"; that is, the characteristic polynomial of A divides (»n(t))". 9.28. Prove Theorem 9.11: The characteristic polynomial A{t) and the minimum poly- nomial m{t) of a matrix A have the same irreducible factors. Suppose f{t) is an irreducible polsmomial. If f{t) divides m{t) then, since m(t) divides A(t), f(t) divides A(t). On the other hand, if f(t) divides A(t) then, by the preceding problem, /(«) divides (m(t))^. But f(t) is irreducible; hence f(t) also divides m{t). Thus m{t) and A(t) have the same irreducible factors. 9.29. Let r be a linear operator on a vector space V of finite dimension. Show that T is invertible if and only if the constant term of the minimal (characteristic) polynomial of T is not zero. Suppose the minimal (characteristic) polynomial of T is f(t) = f + a„_it'— 1 -I- • • • + a^t + a^f. 
Each of the following statements is equivalent to the succeeding one by preceding results: (1) T is invertible; (ii) T is nonsingular; (iii) is not an eigenvalue of T; (iv) is not a root of m(t); (v) the constant term Uf, is not zero. Thus the theorem is proved. CHAP. 9] EIGENVALUES AND EIGENVECTORS 215 9.30. Suppose dimF = ». Let T:V ^V he an invertible operator. Show that T-* is equal to a polynomial in T of degree not exceeding n. Let m(t) be the minimal polynomial of T. Then m{t) = f + a^-if— i + • • • + a^t + a^, where r — n. Since T is invertible, Kq # 0. We have Hence m(r) = y + a^_ir'-i + + aJ)T = I and r-i = OiT + do/ = 1 tto (Tr-i + ar-i r'-2 + • • • + aj) MISCELLANEOUS PROBLEMS 9.31. Let T be a linear operator on a vector space V of dimension n. Without using the Cayley-Hamilton theorem, show that T is a zero of a nonzero polynomial. Let N - rfi. Consider the following N-\-l operators on V: I, T,T^, .... T^. Recall that the vector space A(V) of operators on V has dimension N — rfi. Thus the above iV + 1 operators are linearly dependent. Hence there exist scalars a^, Oj, . . ., aj, for which a^^T^ + •• • + a^T + a^ = 0. Accordingly, T is a zero of the polynomial f{t) = a^^t^ + ••• + a^t + Oq. 9.32. Prove Theorem 9.13: Let A be an eigenvalue of an operator T:V ^V. The geometric multiplicity of X does not exceed its algebraic multiplicity. Suppose the geometric multiplicity of X is r. Then X contains r linearly independent eigenvectors Vi, ...,Vr. Extend the set {dJ to a basis of V: {v^, ...,v^, w^, . . .,Wg}. We have T{vi) = XVi 1 1 '(^2) = \V2 \Vr) = W, T(w,) = aiiVi + ... + airVr + 6nWi + • • • + bi,w. I 7 '(^2) = a2iVi + ... + a2rVr + b2tWi + ■ • + b2s'Ws '(W.) = O'sl'Vl + ... + O'sr'Or + &SIW1 + • ■ + bss^s The matrix of T in the above basis is /^ { «ii «21 •■• 0-sl\ r. X 1 «'12 »22 • • • «s2 \ = (^'l M = 1 . X l<Hr ttgr ■ • • O'sr A^ » 1 *" 621 ... 6,1 Vo 5/ ". 1 &12 622 • • • 6r2 \. 
1 hs hs ■•■ 6ss / where A = (0.;^)' and B = (»«)•• By Problem 9.20 the characteristic polynomial of X/,, which is (t — \Y, must divide the char- acteristic polynomial of M and hence T. Thus the algebraic multiplicity of X for the operator T is at least r, as required. 9.33. Show that A = is not diagonalizable. The characteristic polynomial of A is A(t) = (t — 1)^; hence 1 is the only eigenvalue of A. We find a basis of the eigenspace of the eigenvalue 1. Substitute t = 1 into the matrix tl — A to obtain the homogeneous system 216 EIGENVALUES AND EIGENVECTORS [CHAP. 9 o)(^) = (I) «'• { = The system has only one independent solution, e.g. x = 1, y = 0. Hence u — (1, 0) forms a basis of the eigenspace of 1. Since A has at most one independent eigenvector, A cannot be diagonalized. 9.34. Let F be an extension of a field K. Let A be an w-square matrix over K. Note that A may also be viewed as a matrix A over F. Clearly \tl ~ A\ = \tl - A\, that is, A and A have the same characteristic polynomial. Show that A and A also have the same minimum polynomial. Let m{t) and m'(t) be the minimum polynomials of A and A respectively. Now m'{t) divides every polynomial over F vifhich has A as a zero. Since m(t) has A as a zero and since m{t) may be viewed as a polynomial over F, m'{t) divides m(t). We show now that m(t) divides m'(t). Since m'(t) is a polynomial over F which is an extension of K, we may write m'(t) = fi(t)hi + /aft) 62 + ■•• + fnit)b„ where /i(«) are polynomials over K, and 61, . . . , 6„ belong to F and are linearly independent over K. We have m'{A) = /i(A)bi + /2(A)62 + ••• + /„(A)6„ = (1) Let alP denote the y-entry of /fc(A). The above matrix equation implies that, for each pair {i,j), a</'6i + aif 62 + • • ■ + a^f &„ = Since the 64 are linearly independent over K and since the a\P S K, every tty''' = 0. 
Then /i(A) = 0, /2(A) = 0, ..., /„(A) = Since the /i(t) are polynomials over K which have A as a zero and since m(t) is the minimum poly- nomial of A as a matrix over K, m(t) divides each of the /;(«)• Accordingly, by (1), m(t) must also divide m'(t). But monic polynomials which divide each other are necessarily equal. That is, m(t) = tn'{t), as required. 9.35. Let {vi, . . . , v„} be a basis of 7. Let T : F -* 7 be an operator for which T{vi) = 0, T{V2) = chiVi, T{vs) - aaivi + asiVi, . . ., Tivn) = a„iVi + • • • + an,n-iv„-i. Show that T"- 0. It suffices to show that Ti{Vj) = (*) for i — 1, . . .,n. For then it follows that THvj) - T»-i(Ti(Vj)) = r"-5(0) = 0, for ^ = 1 n and, since {v^, ...,■«„} is a basis, T" = 0. We prove {*) by induction on j. The case j = 1 is true by hypothesis. The inductive step follows (for j = 2, . . .,n} from Ti{Vj) = Ti-HT(Vj)) = TS-HajiVi+ ■■■+aj^j.iVj^i) = aj^Ti-Hvi) + ••• +aj.,_irJ-i(Vj_i) = ajiO + ■•• + aj,j_iO = Remark. Observe that the matrix representation of T in the above basis is triangular with diagonal elements 0: /O a^i ttsi ... a-ni 032 .. . a„2 ... a„,„_i \0 ... CHAP. 9] EIGENVALUES AND EIGENVECTORS 217 Supplementary Problems POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 9.36. Let f(t) = 2*2 - 5t + 6 and g(t) = t^ - 21^ + t + i. Find f(A), g{A), f(B) and g(B) where 9.37. Let r:E2^R2 be defined by T(x,y) = {x + y,2.x). Let /(i) = t2 - 2t + 3. ¥\n& f(T)(x,y). 9.38. Let V be the vector space of polynomials v(x) = ax^ + hx + c. Let D:V ^V be the differential operator. Let f{t) = fi + 2t ~ 5. Find f(D){v{x)). 9.39. Let A .0 Find A2, A3, A". '8 12 0^ 9.40. Let B = I 8 12 8/ Find a real matrix A such that B = As. 9.41. Consider a diagonal matrix M and a triangular matrix N: ' «! ... \ a. and AT a^ b ... c 02 ... d \ M = " "-2 • ... a J Show that, for any polynomial f(t), f(M) and f(N) are of the form '/(tti) ... \ //(ai) X f(M) = I ^^ f^"'^^ ■■■ and /(AT) = ... f{aj 9.42. 
Consider a block diagonal matrix M and a block triangular matrix N: /Ai ... \ /Ai B M = A2 ... \ ... A„, and N = A, C D where the Aj are square matrices. Show that, for any polynomial /(*), f(M) and f{N) are of the form '/(Ai) ... \ //(Ai) X ... F \ f{M) = I /(A2) ... ^^^ ^(^) ^ I /(A2) ... ... /(A„)/ \ ... /(Aj/ 9.43. Show that for any square matrix (or operator) A, (P^iAP)" = P-iA"F where P is invertible. More generally, show that f{P-^AP) = p-if{A)P for any polynomial f(t). 9.44. Let f{t) be any polynomial. Show that: (i) /(Af) = (/(A))*; (ii) if A is symmetric, i.e. A* = A, then /(A) is symmetric. EIGENVALUES AND EIGENVECTORS 9.45. For each matrix, find all eigenvalues and linearly independent eigenvectors: /22\ /42\ /5-1 « ^ =(1 3)' (") ^ =(3 3)' (-) ^=(13 Find invertible matrices Pj, Pg and P3 such that P~iAPi, P-^BP^ and P~^CPs are diagonal. 218 EIGENVALUES AND EIGENVECTORS [CHAP. 9 9.46. For each matrix, find all eigenvalues and a basis for each eigenspace: /3 1 l\ / 1 2 2\ /l 1 0^ (i) A := 2 4 2 , (ii) B = 1 2 -1 , (iii) C = 1 \1 1 3/ \-l 14/ \0 ly When possible, find invertible matrices Pj, P2 and P3 such that P~^APi, F^'BPj ^"d Pj^CPs are diagonal. 9.47. Consider A = I ^ ) and B = I ) as matrices over the real field R. Find all eigen- Vl 4y \1Z -3/ values and linearly independent eigenvectors. 9.48. Consider A and B in the preceding problem as matrices over the complex field C. Find all eigen- values and linearly independent eigenvectors. 9.49. For each of the following operators T -.K^ -* R^, find all eigenvalues and a basis for each eigen- space: (i) T{x,y) = (3x + 3y,x + 5y); (ii) T{x,y) = (y,x); (iii) T(x,y) = (y,-x). 9.50. For each of the following operators T : R^ -> R3, find all eigenvalues and a basis for each eigenspace: (i) T{x, y,z) — (x + y + z, 2y + z, 2y + 3«); (ii) T(,x, y,z) = (x + y,y + z, —2y — z); (iii) t\x, y, 2) = (x-y,2x + 3y + 2z, x-^y + 2z). 9.51. 
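Supplementary problems like 9.45 are easy to machine-check once candidate eigenpairs are in hand. A sketch for part (i) (assuming Python with NumPy; the eigenpairs used below are the ones listed in the answer section at the end of the chapter):

```python
import numpy as np

# Problem 9.45(i): A = [[2, 2], [1, 3]]
A = np.array([[2.0, 2.0],
              [1.0, 3.0]])

# Claimed answers: lambda = 1 with u = (2, -1); lambda = 4 with v = (1, 1)
u = np.array([2.0, -1.0])
v = np.array([1.0, 1.0])
assert np.allclose(A @ u, 1 * u)
assert np.allclose(A @ v, 4 * v)

# A matrix P with the eigenvectors as columns diagonalizes A
P = np.column_stack([u, v])
D = np.linalg.inv(P) @ A @ P     # diag(1, 4)
```

The same three-line check (apply A, compare with the scalar multiple, form P) settles parts (ii) and (iii) as well; in (iii) the single independent eigenvector shows that no such invertible P exists.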
For each of the following matrices over the complex field C, find all eigenvalues and linearly independent eigenvectors: <"(: ;)• ""(J D- «G:r). «(;:?; 9.52. Suppose v is an eigenvector of operators S and T. Show that v is also an eigenvector of the operator aS + bT where a and 6 are any scalars. 9.53. Suppose v is an eigenvector of an operator T belonging to the eigenvalue X. Show that for n > 0, V is also an eigenvector of T" belonging to \". 9.54. Suppose X is an eigenvalue of an operator T. Show that /(X) is an eigenvalue of f(T). 9.55. Show that similar matrices have the same eigenvalues. 9.56. Show that matrices A and A* have the same eigenvalues. Give an example where A and A' have different eigenvectors. 9.57. Let S and T be linear operators such that ST — TS. Let X be an eigenvalue of T and let W be its eigenspace. Show that W is invariant under S, i.e. S(W) C W. 9.58. Let y be a vector space of finite dimension over the complex field C. Let W # {0} be a subspace of V invariant under a linear operator T : V -* V. Show that W contains a nonzero eigenvector of T. 9.59. Let A be an ii-square matrix over K. Let v^, . . .yV^G K" be linearly independent eigenvectors of A belonging to the eigenvalues Xj, . . . , X„ respectively. Let P be the matrix whose columns are the vectors Vi,...,v„. Show that P~^AP is the diagonal matrix whose diagonal elements are the eigenvalues Xj, . . . , X„. CHARACTERISTIC AND MINIMUM POLYNOMIALS 9.60. For each matrix, find a polynomial for which the matrix is a root: '2 3 -2^ 4 1. ^^ ^ = (4 D' (") ^ = (3 3)' ^"^ '^='[1 i_i CHAP. 9] EIGENVALUES AND EIGENVECTORS 219 9.61. Consider the w-square matrix A = Show that /(«) = (t- X)™ is both the characteristic and minimum polynomial of A. 9.62. Find the characteristic and minimum polynomials of each matrix: A = /2 5 0\ 2 4 2 3 5 \0 7/ B 3 1 3 3 1 3 1 3 c = IX 0\ 0X000 0X00 0X0 VO x/ 1 1 0\ /2 0\ 9.63. Let A = 2 and B = 2 2 . 
Show that A and B have different characteristic polynomials (and so are not similar), but have the same minimum polynomial. Thus nonsimilar matrices may have the same minimum polynomial.

9.64. The mapping T : V -> V defined by T(v) = kv is called the scalar mapping belonging to k ∈ K. Show that T is the scalar mapping belonging to k ∈ K if and only if the minimal polynomial of T is m(t) = t − k.

9.65. Let A be an n-square matrix for which A^k = 0 for some k > n. Show that A^n = 0.

9.66. Show that a matrix A and its transpose A^t have the same minimum polynomial.

9.67. Suppose f(t) is an irreducible monic polynomial for which f(T) = 0, where T is a linear operator T : V -> V. Show that f(t) is the minimal polynomial of T.

9.68. Consider a block matrix M = (A  B). Show that tI − M = (tI − A    −B   ) is the characteristic matrix of M.
                                  (C  D)                     (  −C    tI − D)

9.69. Let T be a linear operator on a vector space V of finite dimension. Let W be a subspace of V invariant under T, i.e. T(W) ⊂ W. Let T_W : W -> W be the restriction of T to W. (i) Show that the characteristic polynomial of T_W divides the characteristic polynomial of T. (ii) Show that the minimum polynomial of T_W divides the minimum polynomial of T.

9.70. Let A = (a11  a12  a13)
              (a21  a22  a23)
              (a31  a32  a33)

Show that the characteristic polynomial of A is

    Δ(t) = t^3 − (a11 + a22 + a33) t^2 + ( |a11  a12|  +  |a11  a13|  +  |a22  a23| ) t  −  |a11  a12  a13|
                                          |a21  a22|     |a31  a33|     |a32  a33|         |a21  a22  a23|
                                                                                           |a31  a32  a33|

9.71. Let A be an n-square matrix. The determinant of the matrix of order n − m obtained by deleting the rows and columns passing through m diagonal elements of A is called a principal minor of degree n − m. Show that the coefficient of t^m in the characteristic polynomial Δ(t) = |tI − A| is the sum of all principal minors of A of degree n − m multiplied by (−1)^(n−m). (Observe that the preceding problem is a special case of this result.)

9.72. Consider an arbitrary monic polynomial f(t) = t^n + a_(n−1) t^(n−1) + ··· + a1 t + a0.
The following n-square matrix A is called the companion matrix of f(t):

    A = (0  0  ...  0  −a0     )
        (1  0  ...  0  −a1     )
        (0  1  ...  0  −a2     )
        (       ...            )
        (0  0  ...  1  −a_(n−1))

Show that f(t) is the minimum polynomial of A.

9.73. Find a matrix A whose minimum polynomial is (i) t^3 − 5t^2 + 6t + 8, (ii) t^4 − 5t^3 − 2t^2 + 7t + 4.

DIAGONALIZATION

9.74. Let A = (a  b) be a matrix over the real field R. Find necessary and sufficient conditions on a, b, c and d so that A is diagonalizable, i.e. has two linearly independent eigenvectors.
              (c  d)

9.75. Repeat the preceding problem for the case that A is a matrix over the complex field C.

9.76. Show that a matrix (operator) is diagonalizable if and only if its minimal polynomial is a product of distinct linear factors.

9.77. Let A and B be n-square matrices over K such that (i) AB = BA and (ii) A and B are both diagonalizable. Show that A and B can be simultaneously diagonalized, i.e. there exists a basis of K^n in which both A and B are represented by diagonal matrices. (See Problem 9.57.)

9.78. Let E : V -> V be a projection operator, i.e. E^2 = E. Show that E is diagonalizable and, in fact, can be represented by the diagonal matrix

    A = (I_r  0)
        ( 0   0)

where r is the rank of E.

Answers to Supplementary Problems

9.36. f(A) = (−26   −3)      g(A) = (−40   39)      f(B) = (3   6)      g(B) = (3  12)
             (  5  −27)             (−65  −27)              (0   ·)             (0  15)

9.37. f(T)(x, y) = (4x − y, −2x + 5y).

9.38. f(D)(v(x)) = −5ax^2 + (4a − 5b)x + (2a + 2b − 5c).

9.39. --': I)' ^'-(i t)' ^-ii:

9.40. Hint. Let

    A = (2  a  b)
        (0  2  c)
        (0  0  2)

Set B = A^2 and then obtain conditions on a, b and c.

9.44. (ii) Using (i), we have (f(A))^t = f(A^t) = f(A).

9.45. (i) λ1 = 1, u = (2, −1); λ2 = 4, v = (1, 1). (ii) λ1 = 1, u = (2, −3); λ2 = 6, v = (1, 1). (iii) λ = 4, u = (1, 1).

    Let P1 = ( 2  1) and P2 = ( 2  1). P3 does not exist since C has only one independent eigenvector, and so cannot be diagonalized.
             (−1  1)          (−3  1)

9.46. (i) λ1 = 2, u = (1, −1, 0), v = (1, 0, −1); λ2 = 6, w = (1, 2, 1).
(ii) λ1 = 3, u = (1, 1, 0), v = (1, 0, 1); λ2 = 1, w = (2, −1, 1). (iii) λ = 1, u = (1, 0, 0), v = (0, 0, 1).

    Let P1 = ( 1  1  1)    and    P2 = (1  1   2)
             (−1  0  2)                (1  0  −1)
             ( 0 −1  1)                (0  1   1)

P3 does not exist since C has at most two linearly independent eigenvectors, and so cannot be diagonalized.

9.47. (i) λ = 3, u = (1, −1); (ii) B has no eigenvalues (in R).

9.48. (i) λ = 3, u = (1, −1). (ii) λ1 = 2i, u = (1, 3 − 2i); λ2 = −2i, v = (1, 3 + 2i).

9.49. (i) λ1 = 2, u = (3, −1); λ2 = 6, v = (1, 1). (ii) λ1 = 1, u = (1, 1); λ2 = −1, v = (1, −1). (iii) There are no eigenvalues (in R).

9.50. (i) λ1 = 1, u = (1, 0, 0); λ2 = 4, v = (1, 1, 2). (ii) λ = 1, u = (1, 0, 0). There are no other eigenvalues (in R). (iii) λ1 = 1, u = (1, 0, −1); λ2 = 2, v = (2, −2, −1); λ3 = 3, w = (1, −2, −1).

9.51. (i) λ1 = 1, u = (1, 0); λ2 = i, v = (1, 1 + i). (ii) λ = 1, u = (1, 0). (iii) λ1 = 2, u = (3, i); λ2 = −2, v = (1, −i). (iv) λ1 = i, u = (2, 1 − i); λ2 = −i, v = (2, 1 + i).

9.56. Let A = (1  1). Then λ = 1 is the only eigenvalue and v = (1, 0) generates the eigenspace of λ = 1. On the other hand, for A^t = (1  0), λ = 1 is still the only eigenvalue, but w = (0, 1) generates the eigenspace of λ = 1.
              (0  1)                                                                                                                  (1  1)

9.57. Let v ∈ W, and so T(v) = λv. Then T(Sv) = S(Tv) = S(λv) = λ(Sv), that is, Sv is an eigenvector of T belonging to the eigenvalue λ. In other words, Sv ∈ W and thus S(W) ⊂ W.

9.58. Let T̂ : W -> W be the restriction of T to W. The characteristic polynomial of T̂ is a polynomial over the complex field C which, by the fundamental theorem of algebra, has a root λ. Then λ is an eigenvalue of T̂, and so T̂ has a nonzero eigenvector in W which is also an eigenvector of T.

9.59. Suppose T(v) = λv. Then (kT)(v) = kT(v) = k(λv) = (kλ)v.

9.60. (i) f(t) = t^2 − 5t + 43, (ii) g(t) = t^2 − 8t + 23, (iii) h(t) = t^3 − 6t^2 + 5t − 12.

9.62. (i) Δ(t) = (t − 2)^3 (t − 7)^2; m(t) = (t − 2)^2 (t − 7). (ii) Δ(t) = (t − 3)^5; m(t) = (t − 3)^3. (iii) Δ(t) = (t − λ)^5; m(t) = t − λ.

9.73. Use the result of Problem 9.72.
(i) A = (0  0  −8)    (ii) A = the 4-square companion matrix of the given polynomial (see Problem 9.72).
        (1  0  −6)
        (0  1   5)

9.77. Hint. Use the result of Problem 9.57.

chapter 10

Canonical Forms

INTRODUCTION

Let T be a linear operator on a vector space of finite dimension. As seen in the preceding chapter, T may not have a diagonal matrix representation. However, it is still possible to "simplify" the matrix representation of T in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary decomposition theorem, and the triangular, Jordan and rational canonical forms.

We comment that the triangular and Jordan canonical forms exist for T if and only if the characteristic polynomial Δ(t) of T has all its roots in the base field K. This is always true if K is the complex field C but may not be true if K is the real field R.

We also introduce the idea of a quotient space. This is a very powerful tool and will be used in the proof of the existence of the triangular and rational canonical forms.

TRIANGULAR FORM

Let T be a linear operator on an n-dimensional vector space V. Suppose T can be represented by the triangular matrix

    A = (a11  a12  ...  a1n)
        (     a22  ...  a2n)
        (          ...     )
        (               ann)

Then the characteristic polynomial of T,

    Δ(t) = |tI − A| = (t − a11)(t − a22) ... (t − ann)

is a product of linear factors. The converse is also true and is an important theorem; namely,

Theorem 10.1: Let T : V -> V be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of V in which T is represented by a triangular matrix.

Alternate Form of Theorem 10.1: Let A be a square matrix whose characteristic polynomial factors into linear polynomials. Then A is similar to a triangular matrix, i.e. there exists an invertible matrix P such that P^-1 A P is triangular.

We say that an operator T can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case the eigenvalues of T are precisely those entries appearing on the main diagonal. We give an application of this remark.
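Before the worked example, the remark can be checked numerically. The following sketch (in Python, with an arbitrarily chosen triangular matrix; it is an illustration added here, not part of the original text) verifies that each diagonal entry of a triangular matrix is a root of Δ(t) = |tI − A|, while other values are not:

```python
# Numerical check: for a triangular matrix, the eigenvalues are
# exactly the diagonal entries.  The matrix A is an arbitrary example.

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def char_poly_at(A, t):
    """Evaluate the characteristic polynomial det(tI - A) at t."""
    n = len(A)
    return det([[(t if i == j else 0.0) - A[i][j] for j in range(n)]
                for i in range(n)])

A = [[2.0, 5.0, -1.0],
     [0.0, 3.0,  4.0],
     [0.0, 0.0, -1.0]]   # upper triangular; diagonal entries 2, 3, -1

# each diagonal entry is a root of det(tI - A) ...
for lam in (2.0, 3.0, -1.0):
    assert abs(char_poly_at(A, lam)) < 1e-9
# ... while a value not on the diagonal is not a root
assert abs(char_poly_at(A, 1.0)) > 1e-9
print("diagonal entries are exactly the eigenvalues")
```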
Example 10.1: Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A^2. Show that √λ or −√λ is an eigenvalue of A.

We know by the above theorem that A is similar to a triangular matrix

    B = (μ1   *   ...  * )
        (     μ2  ...  * )
        (          ...   )
        (              μn)

Hence A^2 is similar to the matrix

    B^2 = (μ1^2   *    ...   *  )
          (      μ2^2  ...   *  )
          (            ...      )
          (                 μn^2)

Since similar matrices have the same eigenvalues, λ = μi^2 for some i. Hence μi = √λ or μi = −√λ; that is, √λ or −√λ is an eigenvalue of A.

INVARIANCE

Let T : V -> V be linear. A subspace W of V is said to be invariant under T or T-invariant if T maps W into itself, i.e. if v ∈ W implies T(v) ∈ W. In this case T restricted to W defines a linear operator on W; that is, T induces a linear operator T̂ : W -> W defined by T̂(w) = T(w) for every w ∈ W.

Example 10.2: Let T : R^3 -> R^3 be the linear operator which rotates each vector about the z axis by an angle θ:

    T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)

Observe that each vector w = (a, b, 0) in the xy plane W remains in W under the mapping T, i.e. W is T-invariant. Observe also that the z axis U is invariant under T. Furthermore, the restriction of T to W rotates each vector about the origin O, and the restriction of T to U is the identity mapping on U.

Example 10.3: Nonzero eigenvectors of a linear operator T : V -> V may be characterized as generators of T-invariant 1-dimensional subspaces. For suppose T(v) = λv, v ≠ 0. Then W = {kv, k ∈ K}, the 1-dimensional subspace generated by v, is invariant under T because

    T(kv) = k T(v) = k(λv) = (kλ)v ∈ W

Conversely, suppose dim U = 1 and u ≠ 0 generates U, and U is invariant under T. Then T(u) ∈ U and so T(u) is a multiple of u, i.e. T(u) = μu. Hence u is an eigenvector of T.

The next theorem gives us an important class of invariant subspaces.

Theorem 10.2: Let T : V -> V be linear, and let f(t) be any polynomial. Then the kernel of f(T) is invariant under T.

The notion of invariance is related to matrix representations as follows.
Theorem 10.3: Suppose W is an invariant subspace of T : V -> V. Then T has a block matrix representation

    (A  B)
    (0  C)

where A is a matrix representation of the restriction of T to W.

INVARIANT DIRECT-SUM DECOMPOSITIONS

A vector space V is termed the direct sum of its subspaces W1, ..., Wr, written

    V = W1 ⊕ W2 ⊕ ··· ⊕ Wr

if every vector v ∈ V can be written uniquely in the form

    v = w1 + w2 + ··· + wr    with  wi ∈ Wi

The following theorem applies.

Theorem 10.4: Suppose W1, ..., Wr are subspaces of V, and suppose {w_11, ..., w_1n_1}, ..., {w_r1, ..., w_rn_r} are bases of W1, ..., Wr respectively. Then V is the direct sum of the Wi if and only if the union {w_11, ..., w_1n_1, ..., w_r1, ..., w_rn_r} is a basis of V.

Now suppose T : V -> V is linear and V is the direct sum of (nonzero) T-invariant subspaces W1, ..., Wr:

    V = W1 ⊕ ··· ⊕ Wr    and    T(Wi) ⊂ Wi,  i = 1, ..., r

Let Ti denote the restriction of T to Wi. Then T is said to be decomposable into the operators Ti or T is said to be the direct sum of the Ti, written T = T1 ⊕ ··· ⊕ Tr. Also, the subspaces W1, ..., Wr are said to reduce T or to form a T-invariant direct-sum decomposition of V.

Consider the special case where two subspaces U and W reduce an operator T : V -> V; say, dim U = 2 and dim W = 3, and suppose {u1, u2} and {w1, w2, w3} are bases of U and W respectively. If T1 and T2 denote the restrictions of T to U and W respectively, then

    T1(u1) = a11 u1 + a12 u2        T2(w1) = b11 w1 + b12 w2 + b13 w3
    T1(u2) = a21 u1 + a22 u2        T2(w2) = b21 w1 + b22 w2 + b23 w3
                                    T2(w3) = b31 w1 + b32 w2 + b33 w3

Hence

    A = (a11  a21)        B = (b11  b21  b31)
        (a12  a22)            (b12  b22  b32)
                              (b13  b23  b33)

are matrix representations of T1 and T2 respectively. By the above theorem {u1, u2, w1, w2, w3} is a basis of V. Since T(ui) = T1(ui) and T(wj) = T2(wj), the matrix of T in this basis is the block diagonal matrix

    (A  0)
    (0  B)

A generalization of the above argument gives us the following theorem.
Theorem 10.5: Suppose T : V -> V is linear and V is the direct sum of T-invariant subspaces W1, ..., Wr. If Ai is a matrix representation of the restriction of T to Wi, then T can be represented by the block diagonal matrix

    M = (A1            )
        (    A2        )
        (        ...   )
        (            Ar)

The block diagonal matrix M with diagonal entries A1, ..., Ar is sometimes called the direct sum of the matrices A1, ..., Ar and denoted by M = A1 ⊕ ··· ⊕ Ar.

PRIMARY DECOMPOSITION

The following theorem shows that any operator T : V -> V is decomposable into operators whose minimal polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for T.

Primary Decomposition Theorem 10.6: Let T : V -> V be a linear operator with minimal polynomial

    m(t) = f1(t)^n1 f2(t)^n2 ... fr(t)^nr

where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr where Wi is the kernel of fi(T)^ni. Moreover, fi(t)^ni is the minimal polynomial of the restriction of T to Wi.

Since the polynomials fi(t)^ni are relatively prime, the above fundamental result follows (Problem 10.11) from the next two theorems.

Theorem 10.7: Suppose T : V -> V is linear, and suppose f(t) = g(t) h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).

Theorem 10.8: In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t) are monic], then g(t) and h(t) are the minimal polynomials of the restrictions of T to U and W respectively.

We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators.

Theorem 10.9: A linear operator T : V -> V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.
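The criterion of Theorem 10.9 can be illustrated with two small matrices (a hedged sketch in Python; the matrices E and N below are examples chosen here, not from the text). A projection E satisfies E^2 = E, so E is a zero of t^2 − t = t(t − 1), a product of distinct linear factors, and hence is diagonalizable; a nonzero nilpotent N with N^2 = 0 has minimal polynomial t^2, a repeated linear factor, and hence is not diagonalizable:

```python
# E is a projection: E^2 = E, so its minimal polynomial divides t(t - 1).
# N is nilpotent with N != 0 and N^2 = 0, so its minimal polynomial is t^2.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

E = [[1, 1],
     [0, 0]]          # projection (E^2 = E): diagonalizable
N = [[0, 1],
     [0, 0]]          # nilpotent (N^2 = 0, N != 0): not diagonalizable

assert matmul(E, E) == E                       # E satisfies t^2 - t
assert matmul(N, N) == [[0, 0], [0, 0]]        # N satisfies t^2
assert N != [[0, 0], [0, 0]]                   # but N does not satisfy t
print("E is a root of t(t-1); N is a root of t^2 only")
```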
Alternate Form of Theorem 10.9: A matrix A is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.

Example 10.4: Suppose A ≠ I is a square matrix for which A^3 = I. Determine whether or not A is similar to a diagonal matrix if A is a matrix over (i) the real field R, (ii) the complex field C.

Since A^3 = I, A is a zero of the polynomial f(t) = t^3 − 1 = (t − 1)(t^2 + t + 1). The minimal polynomial m(t) of A cannot be t − 1, since A ≠ I. Hence

    m(t) = t^2 + t + 1    or    m(t) = t^3 − 1

Since neither polynomial is a product of linear polynomials over R, A is not diagonalizable over R. On the other hand, each of the polynomials is a product of distinct linear polynomials over C. Hence A is diagonalizable over C.

NILPOTENT OPERATORS

A linear operator T : V -> V is termed nilpotent if T^n = 0 for some positive integer n; we call k the index of nilpotency of T if T^k = 0 but T^(k−1) ≠ 0. Analogously, a square matrix A is termed nilpotent if A^n = 0 for some positive integer n, and of index k if A^k = 0 but A^(k−1) ≠ 0. Clearly the minimum polynomial of a nilpotent operator (matrix) of index k is m(t) = t^k; hence 0 is its only eigenvalue. The fundamental result on nilpotent operators follows.

Theorem 10.10: Let T : V -> V be a nilpotent operator of index k. Then T has a block diagonal matrix representation whose diagonal entries are of the form
In the proof of the above theorem, we shall show that the number of N of order i is 2mi — mi+i — Mi- 1, where mj is the nullity of T\ We remark that the above matrix N is itself nilpotent and that its index of nilpotency is equal to its order (Problem 10.13). Note that the matrix N of order 1 is just the 1 x 1 zero matrix (0). JORDAN CANONICAL FORM An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend the base field Z to a field in which the characteristic and minimum polynomials do factor into linear factors; thus in a broad sense every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form. Theorem 10.11: Let T:V ->■¥ be a linear operator whose characteristic and minimum polynomials are respectively A{t) = (t- Ai)"' ...(*- XrY' and m{t) = (i - Ai)"' ...{t- Xr)™- where the Ai are distinct scalars. Then T has a block diagonal matrix representation / whose diagonal entries are of the form /A; 1 ... 0\ Ai 1 ... «/ ij — Ai Ai/ For each A. the corresponding blocks Ja have the following properties: (i) There is at least one Ja of order mi; all other Ja are of order ^ mi. (ii) The sum of the orders of the Ja is m. (iii) The number of Ja equals the geometric multiplicity of Ai. (iv) The number of Ja of each possible order is uniquely determined by T. The matrix J appearing in the above theorem is called the Jordan canonical form of the operator T. A diagonal block Ja is called a Jordan block belonging to the eigenvalue Ai. Observe that Ai ... Ai ... + ... Ai ' . . . Ai Ai 1 . . ^\ Ai 1 . . . . Ai 1 .. . A J 1 .. 1 .. .. 1 .. CHAP. 10] CANONICAL FORMS 227 That is, Jtj = Xil + N where N is the nilpotent block appearing in Theorem 10.10. 
In fact, we prove the above theorem (Problem 10.18) by showing that T can be decomposed into operators, each the sum of a scalar and a nilpotent operator. Example 105: Suppose the characteristic and minimum polynomials of an operator T are respec- tively A(«) = (f-2)4(t-3)3 and m{t) = («-2)2(t-3)2 Then the Jordan canonical form of T is one of the following matrices: or The first matrix occurs if T has two independent eigenvectors belonging to its eigen- value 2; and the second matrix occurs if T has three independent eigenvectors be- longing to 2. CYCLIC SUBSPACES Let r be a linear operator on a vector space V of finite dimension over K. Suppose V GV and v ^ 0. The set of all vectors of the form f{T){v), where f{t) ranges over all polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V gen- erated by v;we denote it by Z{v, T) and denote the restriction of T to Z{v, T) by r„. We could equivalently define Z{v,T) as the intersection of all T-invariant subspaces of V containing v. Now consider the sequence V, T{v), T\v), T\v), . . . of powers of T acting on v. Let k be the lowest integer such that T''{v) is a linear com- bination of those vectors which precede it in the sequence; say, T^iv) = -ttfc-i T'^-^v) - ... - aiT{v) - aov Then m„(i) = t" + ak-it^-^ + ■ ■ ■ + ait + ao is the unique monic polynomial of lowest degree for which mv(T) (v) = 0. We call m„(i) the T-annihilator of v and Z{v, T). The following theorem applies. Theorem 10.12: Let Z(v, T), T^ and m„(i) be defined as above. Then: (i) The set {v, T{v), ..., r'=-i (v)} is a basis of Z{v, T); hence dim Z{v, T) = fe. (ii) The minimal polynomial of T„ is m„(i). (iii) The matrix representation of Tv in the above basis is 228 CANONICAL FORMS [CHAP. 10 . . — tto 1 . . -ai 1 . . -tti . . — aic-2 . . 1 — aic-i The above matrix C is called the companion matrix of the polynomial m„(t). 
RATIONAL CANONICAL FORM In this section we present the rational canonical form for a linear operator T:V^V. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.) Lemma 10.13: Let T:V-*V be a linear operator whose minimal polynomial is /(*)" where f{t) is a monic irreducible polynomial. Then V is the direct sum V = Z{vi, T) © • • • e Zivr, T) of T-cyclic subspaces Z{Vi, T) with corresponding T-annihilators /(*)"', /(«)"^ ■■-, fit)"', n = Ml ^ %2 - • • • - Wr Any other decomposition of V into jT-cyclic subspaces has the same number of components and the same set of T-annihilators. We emphasize that the above lemma does not say that the vectors vi or the T-cyclic sub- spaces Zivi, T) are uniquely determined by T; but it does say that the set of T-annihilators are uniquely determined by T. Thus T has a unique matrix representation \ Cr where the d are companion matrices. In fact, the Ci are the companion matrices to the polynomials /(*)"*. Using the primary decomposition theorem and the above lemma, we obtain the following fundamental result. Theorem 10.14: Let T:V^V be a linear operator with minimal polynomial m{t) = fi{tr^f2{tr- ... fsitr- where the /{(«) are distinct monic irreducible polynomials. Then T has a unique block diagonal matrix representation 'Cn \ Clrj Cs where the C« are companion matrices. In particular, the C« are the com- panion matrices of the polynomials /i(t)"« where mi = nil — ni2 = ni: \' rris = TCsl — ^52 — • • • — Msr. CHAP. 10] CANONICAL FORMS 229 The above matrix representation of T is called its rational canonical form. The poly- nomials /i(i)»" are called the elementary divisors of T. Example 10.6: Let V be a vector space of dimension 6 over R, and let T be a linear operator whose minimal polynomial is m{t) = (t^-t + 3)(« - 2)2. 
Then the rational canonical form of T is one of the following direct sums of companion matrices:

    (i) C(t^2 − t + 3) ⊕ C(t^2 − t + 3) ⊕ C((t − 2)^2)
    (ii) C(t^2 − t + 3) ⊕ C((t − 2)^2) ⊕ C((t − 2)^2)
    (iii) C(t^2 − t + 3) ⊕ C((t − 2)^2) ⊕ C(t − 2) ⊕ C(t − 2)

where C(f(t)) is the companion matrix of f(t); that is,

    C(t^2 − t + 3) = (0  −3)      C((t − 2)^2) = (0  −4)      C(t − 2) = (2)
                     (1   1)                     (1   4)

QUOTIENT SPACES

Let V be a vector space over a field K and let W be a subspace of V. If v is any vector in V, we write v + W for the set of sums v + w with w ∈ W:

    v + W = {v + w : w ∈ W}

These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into mutually disjoint subsets.

Example 10.7: Let W be the subspace of R^2 defined by

    W = {(a, b) : a = b}

That is, W is the line given by the equation x − y = 0. We can view v + W as a translation of the line, obtained by adding the vector v to each point in W. As noted in the diagram on the right, v + W is also a line and is parallel to W. Thus the cosets of W in R^2 are precisely all the lines parallel to W.

In the next theorem we use the cosets of a subspace W of a vector space V to define a new vector space; it is called the quotient space of V by W and is denoted by V/W.

Theorem 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of W in V form a vector space over K with the following operations of addition and scalar multiplication:

    (i) (u + W) + (v + W) = (u + v) + W
    (ii) k(u + W) = ku + W,  where k ∈ K

We note that, in the proof of the above theorem, it is first necessary to show that the operations are well defined; that is, whenever u + W = u' + W and v + W = v' + W, then

    (i) (u + v) + W = (u' + v') + W    and    (ii) ku + W = ku' + W,  for any k ∈ K

In the case of an invariant subspace, we have the following useful result.

Theorem 10.16: Suppose W is a subspace invariant under a linear operator T : V -> V. Then T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W.
Moreover, if T is a zero of any polynomial, then so is T̄. Thus the minimum polynomial of T̄ divides the minimum polynomial of T.

Solved Problems

INVARIANT SUBSPACES

10.1. Suppose T : V -> V is linear. Show that each of the following is invariant under T: (i) {0}, (ii) V, (iii) kernel of T, (iv) image of T.

(i) We have T(0) = 0 ∈ {0}; hence {0} is invariant under T.

(ii) For every v ∈ V, T(v) ∈ V; hence V is invariant under T.

(iii) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T since the kernel of T is a subspace of V. Thus Ker T is invariant under T.

(iv) Since T(v) ∈ Im T for every v ∈ V, it is certainly true if v ∈ Im T. Hence the image of T is invariant under T.

10.2. Suppose {Wi} is a collection of T-invariant subspaces of a vector space V. Show that the intersection W = ∩i Wi is also T-invariant.

Suppose v ∈ W; then v ∈ Wi for every i. Since Wi is T-invariant, T(v) ∈ Wi for every i. Thus T(v) ∈ W = ∩i Wi and so W is T-invariant.

10.3. Prove Theorem 10.2: Let T : V -> V be any linear operator and let f(t) be any polynomial. Then the kernel of f(T) is invariant under T.

Suppose v ∈ Ker f(T), i.e. f(T)(v) = 0. We need to show that T(v) also belongs to the kernel of f(T), i.e. f(T)(T(v)) = 0. Since f(t) t = t f(t), we have f(T) T = T f(T). Thus

    f(T) T(v) = T f(T)(v) = T(0) = 0

as required.

10.4. Find all invariant subspaces of A = (2  −5) viewed as an operator on R^2.
                                          (1  −2)

First of all, we have that R^2 and {0} are invariant under A. Now if A has any other invariant subspace, it must be 1-dimensional. However, the characteristic polynomial of A is

    Δ(t) = |tI − A| = |t − 2     5  |  =  t^2 + 1
                      | −1    t + 2|

Hence A has no eigenvalues (in R) and so A has no eigenvectors. But the 1-dimensional invariant subspaces correspond to the eigenvectors; thus R^2 and {0} are the only subspaces invariant under A.

10.5. Prove Theorem 10.3: Suppose W is an invariant subspace of T : V -> V. Then T has a block matrix representation (A  B; 0  C) where A is a matrix representation of the restriction T̂ of T to W.
We choose a basis {w1, ..., wr} of W and extend it to a basis {w1, ..., wr, v1, ..., vs} of V. We have

    T̂(w1) = T(w1) = a11 w1 + ··· + a1r wr
    T̂(w2) = T(w2) = a21 w1 + ··· + a2r wr
    ....................................
    T̂(wr) = T(wr) = ar1 w1 + ··· + arr wr
    T(v1) = b11 w1 + ··· + b1r wr + c11 v1 + ··· + c1s vs
    T(v2) = b21 w1 + ··· + b2r wr + c21 v1 + ··· + c2s vs
    ....................................
    T(vs) = bs1 w1 + ··· + bsr wr + cs1 v1 + ··· + css vs

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of equations. (See page 150.) Therefore it has the form

    (A  B)
    (0  C)

where A is the transpose of the matrix of coefficients of the subsystem in the wi alone. By the same argument, A is the matrix of T̂ relative to the basis {wi} of W.

10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W, i.e. T̂(w) = T(w) for every w ∈ W. Prove:

(i) For any polynomial f(t), f(T̂)(w) = f(T)(w).
(ii) The minimum polynomial of T̂ divides the minimum polynomial of T.

(i) If f(t) = 0 or if f(t) is a constant, i.e. of degree zero, then the result clearly holds. Assume deg f = n ≥ 1 and that the result holds for polynomials of degree less than n. Suppose that

    f(t) = an t^n + a_(n−1) t^(n−1) + ··· + a1 t + a0

Then

    f(T̂)(w) = (an T̂^n + a_(n−1) T̂^(n−1) + ··· + a0 I)(w)
             = (an T̂^(n−1))(T̂(w)) + (a_(n−1) T̂^(n−1) + ··· + a0 I)(w)
             = (an T^(n−1))(T(w)) + (a_(n−1) T^(n−1) + ··· + a0 I)(w)
             = f(T)(w)

(the middle step uses T̂(w) = T(w) and the inductive hypothesis applied to the polynomials of degree less than n).

(ii) Let m(t) denote the minimum polynomial of T. Then by (i), m(T̂)(w) = m(T)(w) = 0(w) = 0 for every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence the minimum polynomial of T̂ divides m(t).

INVARIANT DIRECT-SUM DECOMPOSITIONS

10.7. Prove Theorem 10.4: Suppose W1, ..., Wr are subspaces of V and suppose, for i = 1, ..., r, {w_i1, ..., w_in_i} is a basis of Wi. Then V is the direct sum of the Wi if and only if the union

    B = {w_11, ..., w_1n_1, ..., w_r1, ..., w_rn_r}

is a basis of V.

Suppose B is a basis of V. Then, for any v ∈ V,

    v = a_11 w_11 + ··· + a_1n_1 w_1n_1 + ··· + a_r1 w_r1 + ··· + a_rn_r w_rn_r = w1 + w2 + ··· + wr

where wi = a_i1 w_i1 + ··· + a_in_i w_in_i
∈ Wi. We next show that such a sum is unique. Suppose

    v = w1' + w2' + ··· + wr'    where  wi' ∈ Wi

Since {w_i1, ..., w_in_i} is a basis of Wi, wi' = b_i1 w_i1 + ··· + b_in_i w_in_i and so

    v = b_11 w_11 + ··· + b_1n_1 w_1n_1 + ··· + b_r1 w_r1 + ··· + b_rn_r w_rn_r

Since B is a basis of V, a_ij = b_ij for each i and each j. Hence wi = wi' and so the sum for v is unique. Accordingly, V is the direct sum of the Wi.

Conversely, suppose V is the direct sum of the Wi. Then for any v ∈ V, v = w1 + ··· + wr where wi ∈ Wi. Since {w_ij} is a basis of Wi, each wi is a linear combination of the w_ij and so v is a linear combination of the elements of B. Thus B spans V. We now show that B is linearly independent. Suppose

    a_11 w_11 + ··· + a_1n_1 w_1n_1 + ··· + a_r1 w_r1 + ··· + a_rn_r w_rn_r = 0

Note that a_i1 w_i1 + ··· + a_in_i w_in_i ∈ Wi. We also have that 0 = 0 + 0 + ··· + 0 where 0 ∈ Wi. Since such a sum for 0 is unique,

    a_i1 w_i1 + ··· + a_in_i w_in_i = 0    for i = 1, ..., r

The independence of the bases {w_ij} implies that all the a's are 0. Thus B is linearly independent and hence is a basis of V.

10.8. Suppose T : V -> V is linear and suppose T = T1 ⊕ T2 with respect to a T-invariant direct-sum decomposition V = U ⊕ W. Show that:

(i) m(t) is the least common multiple of m1(t) and m2(t), where m(t), m1(t) and m2(t) are the minimum polynomials of T, T1 and T2 respectively;
(ii) Δ(t) = Δ1(t) Δ2(t), where Δ(t), Δ1(t) and Δ2(t) are the characteristic polynomials of T, T1 and T2 respectively.

(i) By Problem 10.6, each of m1(t) and m2(t) divides m(t). Now suppose f(t) is a multiple of both m1(t) and m2(t); then f(T1)(U) = 0 and f(T2)(W) = 0. Let v ∈ V; then v = u + w with u ∈ U and w ∈ W. Now

    f(T) v = f(T) u + f(T) w = f(T1) u + f(T2) w = 0 + 0 = 0

That is, T is a zero of f(t). Hence m(t) divides f(t), and so m(t) is the least common multiple of m1(t) and m2(t).

(ii) By Theorem 10.5, T has a matrix representation M = (A  0; 0  B) where A and B are matrix representations of T1 and T2 respectively.
Then, by Problem 9.66,

    Δ(t) = |tI − M| = |tI − A     0  |  =  |tI − A| |tI − B|  =  Δ1(t) Δ2(t)
                      |  0     tI − B|

as required.

10.9. Prove Theorem 10.7: Suppose T : V -> V is linear, and suppose f(t) = g(t) h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).

Note first that U and W are T-invariant by Theorem 10.2. Now since g(t) and h(t) are relatively prime, there exist polynomials r(t) and s(t) such that

    r(t) g(t) + s(t) h(t) = 1

Hence for the operator T,

    r(T) g(T) + s(T) h(T) = I    (*)

Let v ∈ V; then by (*),

    v = r(T) g(T) v + s(T) h(T) v

But the first term in this sum belongs to W = Ker h(T) since

    h(T) r(T) g(T) v = r(T) g(T) h(T) v = r(T) f(T) v = r(T) 0 v = 0

Similarly, the second term belongs to U. Hence V is the sum of U and W.

To prove that V = U ⊕ W, we must show that a sum v = u + w with u ∈ U, w ∈ W, is uniquely determined by v. Applying the operator r(T) g(T) to v = u + w and using g(T) u = 0, we obtain

    r(T) g(T) v = r(T) g(T) u + r(T) g(T) w = r(T) g(T) w

Also, applying (*) to w alone and using h(T) w = 0, we obtain

    w = r(T) g(T) w + s(T) h(T) w = r(T) g(T) w

Both of the above formulas give us w = r(T) g(T) v and so w is uniquely determined by v. Similarly u is uniquely determined by v. Hence V = U ⊕ W, as required.

10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction T1 of T to U and h(t) is the minimal polynomial of the restriction T2 of T to W.

Let m1(t) and m2(t) be the minimal polynomials of T1 and T2 respectively. Note that g(T1) = 0 and h(T2) = 0 because U = Ker g(T) and W = Ker h(T). Thus

    m1(t) divides g(t)    and    m2(t) divides h(t)    (1)

By Problem 10.9, f(t) is the least common multiple of m1(t) and m2(t). But m1(t) and m2(t) are relatively prime since g(t) and h(t) are relatively prime.
Accordingly, f(t) = m1(t) m2(t). We also have that f(t) = g(t) h(t). These two equations together with (1) and the fact that all the polynomials are monic imply that g(t) = m1(t) and h(t) = m2(t), as required.

10.11. Prove the Primary Decomposition Theorem 10.6: Let T : V -> V be a linear operator with minimal polynomial

    m(t) = f1(t)^n1 f2(t)^n2 ... fr(t)^nr

where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr where Wi is the kernel of fi(T)^ni. Moreover, fi(t)^ni is the minimal polynomial of the restriction of T to Wi.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for r − 1. By Theorem 10.7 we can write V as the direct sum of T-invariant subspaces W1 and V1 where W1 is the kernel of f1(T)^n1 and where V1 is the kernel of f2(T)^n2 ... fr(T)^nr. By Theorem 10.8, the minimal polynomials of the restrictions of T to W1 and V1 are respectively f1(t)^n1 and f2(t)^n2 ... fr(t)^nr.

Denote the restriction of T to V1 by T1. By the inductive hypothesis, V1 is the direct sum of subspaces W2, ..., Wr such that Wi is the kernel of fi(T1)^ni and such that fi(t)^ni is the minimal polynomial for the restriction of T1 to Wi. But the kernel of fi(T)^ni, for i = 2, ..., r, is necessarily contained in V1 since fi(t)^ni divides f2(t)^n2 ... fr(t)^nr. Thus the kernel of fi(T)^ni is the same as the kernel of fi(T1)^ni, which is Wi. Also, the restriction of T to Wi is the same as the restriction of T1 to Wi (for i = 2, ..., r); hence fi(t)^ni is also the minimal polynomial for the restriction of T to Wi. Thus

    V = W1 ⊕ W2 ⊕ ··· ⊕ Wr

is the desired decomposition of T.

10.12. Prove Theorem 10.9: A linear operator T : V -> V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.

Suppose m(t) is a product of distinct linear polynomials; say,

    m(t) = (t − λ1)(t − λ2) ... (t − λr)

where the λi are distinct scalars.
By the primary decomposition theorem, V is the direct sum of subspaces W₁, ..., W_r where Wᵢ = Ker (T − λᵢI). Thus if v ∈ Wᵢ, then (T − λᵢI)(v) = 0 or T(v) = λᵢv. In other words, every vector in Wᵢ is an eigenvector belonging to the eigenvalue λᵢ. By Theorem 10.4, the union of bases for W₁, ..., W_r is a basis of V. This basis consists of eigenvectors and so T is diagonalizable.

Conversely, suppose T is diagonalizable, i.e. V has a basis consisting of eigenvectors of T. Let λ₁, ..., λ_s be the distinct eigenvalues of T. Then the operator

f(T) = (T − λ₁I)(T − λ₂I) ··· (T − λ_sI)

maps each basis vector into 0. Thus f(T) = 0 and hence the minimal polynomial m(t) of T divides the polynomial

f(t) = (t − λ₁)(t − λ₂) ··· (t − λ_s)

Accordingly, m(t) is a product of distinct linear polynomials.

NILPOTENT OPERATORS, JORDAN CANONICAL FORM

10.13. Let T : V → V be linear. Suppose, for v ∈ V, T^k(v) = 0 but T^(k−1)(v) ≠ 0. Prove:
(i) The set S = {v, T(v), ..., T^(k−1)(v)} is linearly independent.
(ii) The subspace W generated by S is T-invariant.
(iii) The restriction T̂ of T to W is nilpotent of index k.
(iv) Relative to the basis {T^(k−1)(v), ..., T(v), v} of W, the matrix of T̂ is the k-square matrix with 1's on the superdiagonal and 0's elsewhere:

0 1 0 ··· 0 0
0 0 1 ··· 0 0
···············
0 0 0 ··· 0 1
0 0 0 ··· 0 0

Hence the above k-square matrix is nilpotent of index k.

(i) Suppose

a v + a₁ T(v) + a₂ T²(v) + ··· + a_(k−1) T^(k−1)(v) = 0   (*)

Applying T^(k−1) to (*) and using T^k(v) = 0, we obtain a T^(k−1)(v) = 0; since T^(k−1)(v) ≠ 0, a = 0. Now applying T^(k−2) to (*) and using T^k(v) = 0 and a = 0, we find a₁ T^(k−1)(v) = 0; hence a₁ = 0. Next applying T^(k−3) to (*) and using T^k(v) = 0 and a = a₁ = 0, we obtain a₂ T^(k−1)(v) = 0; hence a₂ = 0. Continuing this process, we find that all the a's are 0; hence S is independent.

(ii) Let u ∈ W. Then

u = b v + b₁ T(v) + b₂ T²(v) + ··· + b_(k−1) T^(k−1)(v)

Using T^k(v) = 0, we have that

T(u) = b T(v) + b₁ T²(v) + ··· + b_(k−2) T^(k−1)(v) ∈ W

Thus W is T-invariant.

(iii) By hypothesis T^k(v) = 0.
Hence, for i = 0, ..., k − 1,

T̂^k(Tⁱ(v)) = T^(k+i)(v) = 0

That is, applying T̂^k to each generator of W, we obtain 0; hence T̂^k = 0 and so T̂ is nilpotent of index at most k. On the other hand, T̂^(k−1)(v) = T^(k−1)(v) ≠ 0; hence T̂ is nilpotent of index exactly k.

(iv) For the basis {T^(k−1)(v), T^(k−2)(v), ..., T(v), v} of W,

T̂(T^(k−1)(v)) = T^k(v) = 0
T̂(T^(k−2)(v)) = T^(k−1)(v)
···············
T̂(T(v)) = T²(v)
T̂(v) = T(v)

Hence the matrix of T̂ in this basis is the k-square matrix with 1's on the superdiagonal and 0's elsewhere, as shown above.

10.14. Let T : V → V be linear. Let U = Ker Tⁱ and W = Ker T^(i+1). Show that (i) U ⊆ W, (ii) T(W) ⊆ U.

(i) Suppose u ∈ U = Ker Tⁱ. Then Tⁱ(u) = 0 and so T^(i+1)(u) = T(Tⁱ(u)) = T(0) = 0. Thus u ∈ Ker T^(i+1) = W. But this is true for every u ∈ U; hence U ⊆ W.

(ii) Similarly, if w ∈ W = Ker T^(i+1), then T^(i+1)(w) = 0. Thus Tⁱ(T(w)) = T^(i+1)(w) = 0 and so T(W) ⊆ U.

10.15. Let T : V → V be linear. Let X = Ker T^(i−2), Y = Ker T^(i−1) and Z = Ker Tⁱ. By the preceding problem, X ⊆ Y ⊆ Z. Suppose

{u₁, ..., u_r},  {u₁, ..., u_r, v₁, ..., v_s},  {u₁, ..., u_r, v₁, ..., v_s, w₁, ..., w_t}

are bases of X, Y and Z respectively. Show that

S = {u₁, ..., u_r, T(w₁), ..., T(w_t)}

is contained in Y and is linearly independent.

By the preceding problem, T(Z) ⊆ Y and hence S ⊆ Y. Now suppose S is linearly dependent. Then there exists a relation

a₁u₁ + ··· + a_ru_r + b₁ T(w₁) + ··· + b_t T(w_t) = 0

where at least one coefficient is not zero. Furthermore, since {uᵢ} is independent, at least one of the b_k must be nonzero. Transposing, we find

b₁ T(w₁) + ··· + b_t T(w_t) = −a₁u₁ − ··· − a_ru_r ∈ X = Ker T^(i−2)

Hence T^(i−2)(b₁ T(w₁) + ··· + b_t T(w_t)) = 0. Thus T^(i−1)(b₁w₁ + ··· + b_tw_t) = 0 and so

b₁w₁ + ··· + b_tw_t ∈ Y = Ker T^(i−1)

Since {uᵢ, vⱼ} generates Y, we obtain a relation among the uᵢ, vⱼ and w_k where one of the coefficients, i.e. one of the b_k, is not zero. This contradicts the fact that {uᵢ, vⱼ, w_k} is independent. Hence S must also be independent.

10.16. Prove Theorem 10.10: Let T : V → V be a nilpotent operator of index k.
Then T has a block diagonal matrix representation whose diagonal entries are of the form N, the square matrix with 1's on the superdiagonal and 0's elsewhere. There is at least one N of order k and all other N are of orders ≤ k. The number of N of each possible order is uniquely determined by T. Moreover, the total number of N of all orders is the nullity of T.

Suppose dim V = n. Let W₁ = Ker T, W₂ = Ker T², ..., W_k = Ker T^k. Set mᵢ = dim Wᵢ, for i = 1, ..., k. Since T is of index k, W_k = V and W_(k−1) ≠ V and so m_(k−1) < m_k = n. By Problem 10.14,

W₁ ⊆ W₂ ⊆ ··· ⊆ W_k = V

Thus, by induction, we can choose a basis {u₁, ..., u_n} of V such that {u₁, ..., u_(mᵢ)} is a basis of Wᵢ.

We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting

v(1, k) = u_(m_(k−1)+1),  v(2, k) = u_(m_(k−1)+2),  ...,  v(m_k − m_(k−1), k) = u_(m_k)

and setting

v(1, k−1) = Tv(1, k),  v(2, k−1) = Tv(2, k),  ...,  v(m_k − m_(k−1), k−1) = Tv(m_k − m_(k−1), k)

By the preceding problem,

S₁ = {u₁, ..., u_(m_(k−2)), v(1, k−1), ..., v(m_k − m_(k−1), k−1)}

is a linearly independent subset of W_(k−1). We extend S₁ to a basis of W_(k−1) by adjoining new elements (if necessary) which we denote by

v(m_k − m_(k−1) + 1, k−1),  v(m_k − m_(k−1) + 2, k−1),  ...,  v(m_(k−1) − m_(k−2), k−1)

Next we set

v(1, k−2) = Tv(1, k−1),  v(2, k−2) = Tv(2, k−1),  ...,  v(m_(k−1) − m_(k−2), k−2) = Tv(m_(k−1) − m_(k−2), k−1)

Again by the preceding problem,

S₂ = {u₁, ..., u_(m_(k−3)), v(1, k−2), ..., v(m_(k−1) − m_(k−2),
k−2)}

is a linearly independent subset of W_(k−2) which we can extend to a basis of W_(k−2) by adjoining elements

v(m_(k−1) − m_(k−2) + 1, k−2),  v(m_(k−1) − m_(k−2) + 2, k−2),  ...,  v(m_(k−2) − m_(k−3), k−2)

Continuing in this manner we get a new basis for V which for convenient reference we arrange as follows:

v(1, k), ..., v(m_k − m_(k−1), k)
v(1, k−1), ..., v(m_k − m_(k−1), k−1), ..., v(m_(k−1) − m_(k−2), k−1)
···············
v(1, 2), ..., v(m_k − m_(k−1), 2), ..., v(m_(k−1) − m_(k−2), 2), ..., v(m₂ − m₁, 2)
v(1, 1), ..., v(m_k − m_(k−1), 1), ..., v(m_(k−1) − m_(k−2), 1), ..., v(m₂ − m₁, 1), ..., v(m₁, 1)

The bottom row forms a basis of W₁, the bottom two rows form a basis of W₂, etc. But what is important for us is that T maps each vector into the vector immediately below it in the table, or into 0 if the vector is in the bottom row. That is,

Tv(i, j) = v(i, j−1) for j > 1,  Tv(i, 1) = 0

Now it is clear (see Problem 10.13(iv)) that T will have the desired form if the v(i, j) are ordered lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to v(2, 1) and moving up the second column as far as possible, etc. Moreover, there will be exactly

m_k − m_(k−1) diagonal entries of order k
(m_(k−1) − m_(k−2)) − (m_k − m_(k−1)) = 2m_(k−1) − m_k − m_(k−2) diagonal entries of order k−1
···············
2m₂ − m₁ − m₃ diagonal entries of order 2
2m₁ − m₂ diagonal entries of order 1

as can be read off directly from the table. In particular, since the numbers m₁, ..., m_k are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity

m₁ = (m_k − m_(k−1)) + (2m_(k−1) − m_k − m_(k−2)) + ··· + (2m₂ − m₁ − m₃) + (2m₁ − m₂)

shows that the nullity m₁ of T is the total number of diagonal entries of T.

10.17. Let A be a 5-square matrix with A ≠ 0, A² = 0 and rank A = 2; thus A is nilpotent of index 2. Find the nilpotent matrix M in canonical form which is similar to A. CHAP.
Since A is nilpotent of index 2, M contains a diagonal block of order 2 and none greater than 2. Note that rank A = 2; hence nullity of A = 5 − 2 = 3. Thus M contains 3 diagonal blocks. Accordingly M must contain 2 diagonal blocks of order 2 and 1 of order 1; that is,

M =
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0
0 0 0 0 0
0 0 0 0 0

10.18. Prove Theorem 10.11, page 226, on the Jordan canonical form for an operator T.

By the primary decomposition theorem, T is decomposable into operators T₁, ..., T_r, i.e. T = T₁ ⊕ ··· ⊕ T_r, where (t − λᵢ)^mᵢ is the minimal polynomial of Tᵢ. Thus in particular,

(T₁ − λ₁I)^m₁ = 0, ..., (T_r − λ_rI)^m_r = 0

Set Nᵢ = Tᵢ − λᵢI. Then for i = 1, ..., r,

Tᵢ = Nᵢ + λᵢI,  where Nᵢ^mᵢ = 0

That is, Tᵢ is the sum of the scalar operator λᵢI and a nilpotent operator Nᵢ, which is of index mᵢ since (t − λᵢ)^mᵢ is the minimal polynomial of Tᵢ.

Now by Theorem 10.10 on nilpotent operators, we can choose a basis so that Nᵢ is in canonical form. In this basis, Tᵢ = Nᵢ + λᵢI is represented by a block diagonal matrix Mᵢ whose diagonal entries are the matrices Jᵢⱼ. The direct sum J of the matrices Mᵢ is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T.

Lastly we must show that the blocks Jᵢⱼ satisfy the required properties. Property (i) follows from the fact that Nᵢ is of index mᵢ. Property (ii) is true since T and J have the same characteristic polynomial. Property (iii) is true since the nullity of Nᵢ = Tᵢ − λᵢI is equal to the geometric multiplicity of the eigenvalue λᵢ. Property (iv) follows from the fact that the Tᵢ, and hence the Nᵢ, are uniquely determined by T.

10.19. Determine all possible Jordan canonical forms for a linear operator T : V → V whose characteristic polynomial is Δ(t) = (t − 2)³(t − 5)².

Since t − 2 has exponent 3 in Δ(t), 2 must appear three times on the main diagonal. Similarly 5 must appear twice.
Thus, writing J(λ; p) for the Jordan block of order p with eigenvalue λ, the possible Jordan canonical forms are

(i) J(2; 3) ⊕ J(5; 2)
(ii) J(2; 3) ⊕ J(5; 1) ⊕ J(5; 1)
(iii) J(2; 2) ⊕ J(2; 1) ⊕ J(5; 2)
(iv) J(2; 2) ⊕ J(2; 1) ⊕ J(5; 1) ⊕ J(5; 1)
(v) J(2; 1) ⊕ J(2; 1) ⊕ J(2; 1) ⊕ J(5; 2)
(vi) J(2; 1) ⊕ J(2; 1) ⊕ J(2; 1) ⊕ J(5; 1) ⊕ J(5; 1)

For example, form (i) is the matrix

2 1 0 0 0
0 2 1 0 0
0 0 2 0 0
0 0 0 5 1
0 0 0 0 5

10.20. Determine all possible Jordan canonical forms J for a matrix of order 5 whose minimal polynomial is m(t) = (t − 2)².

J must have one Jordan block of order 2 and the others must be of order 2 or 1. Thus there are only two possibilities:

J = J(2; 2) ⊕ J(2; 2) ⊕ J(2; 1)  or  J = J(2; 2) ⊕ J(2; 1) ⊕ J(2; 1) ⊕ J(2; 1)

Note that all the diagonal entries must be 2 since 2 is the only eigenvalue.

QUOTIENT SPACE AND TRIANGULAR FORM

10.21. Let W be a subspace of a vector space V. Show that the following are equivalent: (i) u ∈ v + W, (ii) u − v ∈ W, (iii) v ∈ u + W.

Suppose u ∈ v + W. Then there exists w₀ ∈ W such that u = v + w₀. Hence u − v = w₀ ∈ W. Conversely, suppose u − v ∈ W. Then u − v = w₀ where w₀ ∈ W. Hence u = v + w₀ ∈ v + W. Thus (i) and (ii) are equivalent.

We also have: u − v ∈ W iff −(u − v) = v − u ∈ W iff v ∈ u + W. Thus (ii) and (iii) are also equivalent.

10.22. Prove: The cosets of W in V partition V into mutually disjoint sets. That is: (i) any two cosets u + W and v + W are either identical or disjoint; and (ii) each v ∈ V belongs to a coset; in fact, v ∈ v + W. Furthermore, u + W = v + W if and only if u − v ∈ W, and so (v + w) + W = v + W for any w ∈ W.

Let v ∈ V. Since 0 ∈ W, we have v = v + 0 ∈ v + W, which proves (ii). Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both u + W and v + W. Then u − x ∈ W and x − v ∈ W. The proof of (i) is complete if we show that u + W = v + W. Let u + w₀ be any element in the coset u + W. Since u − x, x − v and w₀ belong to W,

(u + w₀) − v = (u − x) + (x − v) + w₀ ∈ W

Thus u + w₀ ∈ v + W and hence the coset u + W is contained in the coset v + W. Similarly v + W is contained in u + W and so u + W = v + W. The last statement follows from the fact that u + W = v + W if and only if u ∈ v + W, and by the preceding problem this is equivalent to u − v ∈ W.

10.23.
Let W be the solution space of the homo- geneous equation 2x + By + 4:Z = 0. De- scribe the cosets of W in R^. TF is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations 2x + Sy + 4z = k, kGR In particular the coset v + W, where v = (a, b, c), is the solution set of the linear equation 2x + Sy + Az = 2a + 36 + 4c or 2(x -a) + 3(y - 6) + 4(2 - c) = CHAP. 10] CANONICAL FORMS 239 10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15, page 229, are well defined; namely, show that if u + W = u' + W and v + W - v' + W, then (i) {u + v) + W = {u' + V') + W and (ii) ku + W = ku' + W, for any k&K (i) Since u + W ^ u' + W and v + W = v' + W, both u — u' and v — v' belong to W. But then (u + v) - (u' + v') - {u- u') + {v- v') e W. Hence (u + v) + W = (m' + v') + W. (ii) Also, since u — u' S W implies k(u — u') G W, then ku — ku' = k(u — u') G W; hence ku+W = ku' + W. 10.25. Let F be a vector space and W a subspace of V. Show that the natural map ij : F -» V/W, defined by rj{v) = v + W, is linear. For any u,v ^ V and any k G K, we have v{u + v) = u + V + W - u + W + V + W = v{u) + v{v) and v{kv) = kv + W = k(v + W) = k ri(v) Accordingly, r) is linear. 10.26. Let W he a subspace of a vector space V. Suppose {wi, . . . , Wr} is a basis of W and the set of cosets {vi, . . . , Vs}, where Vj = Vj + W, is a basis of the quotient space. Show that B = {vi, . . .,Vs, Wi, . . . , Wr} is a basis of V. Thus dim V = dim W + dim (7/TF). Suppose M e y. Since {■5^} is a basis of V/W, u = u + W — di'i'i + a.2'U2 + • • ■ + ttj^s Hence u — aiVy + • • • + a^v^ + w where w G W. Since {w;} is a basis of W, u — a^Vi + • • • + a^Vg + bjWi + • • ■ + b^w^ Accordingly, B generates V. We now show that B is linearly independent. 
Suppose

c₁v₁ + ··· + c_sv_s + d₁w₁ + ··· + d_rw_r = 0   (1)

Then

c₁v̄₁ + ··· + c_sv̄_s = 0̄ = W

Since {v̄ⱼ} is independent, the c's are all 0. Substituting into (1), we find d₁w₁ + ··· + d_rw_r = 0. Since {wᵢ} is independent, the d's are all 0. Thus B is linearly independent and therefore a basis of V.

10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T : V → V. Then T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus the minimal polynomial of T̄ divides the minimal polynomial of T.

We first show that T̄ is well defined, i.e. if u + W = v + W then T̄(u + W) = T̄(v + W). If u + W = v + W then u − v ∈ W and, since W is T-invariant, T(u − v) = T(u) − T(v) ∈ W. Accordingly,

T̄(u + W) = T(u) + W = T(v) + W = T̄(v + W)

as required.

We next show that T̄ is linear. We have

T̄((u + W) + (v + W)) = T̄(u + v + W) = T(u + v) + W = T(u) + T(v) + W = (T(u) + W) + (T(v) + W) = T̄(u + W) + T̄(v + W)

and

T̄(k(u + W)) = T̄(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = kT̄(u + W)

Thus T̄ is linear.

Now, for any coset u + W in V/W,

T²(u) + W = T(T(u)) + W = T̄(T(u) + W) = T̄(T̄(u + W)) = T̄²(u + W)

hence the operator induced by T² on V/W is T̄². Similarly the operator induced by Tⁿ is T̄ⁿ for any n. Thus for any polynomial f(t) = a_n tⁿ + ··· + a₀ = Σ aᵢtⁱ,

f(T)(u) + W = Σ aᵢTⁱ(u) + W = Σ aᵢ(Tⁱ(u) + W) = Σ aᵢT̄ⁱ(u + W) = (Σ aᵢT̄ⁱ)(u + W) = f(T̄)(u + W)

and so the operator induced by f(T) is f(T̄). Accordingly, if T is a zero of f(t), i.e. f(T) = 0, then f(T̄)(u + W) = 0 + W = W for every coset; hence f(T̄) is the zero operator on V/W and T̄ is also a zero of f(t). Thus the theorem is proved.

10.28. Prove Theorem 10.1: Let T : V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.

The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 × 1 matrix, which is triangular. Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n.
Since the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say T(v) = a₁₁v. Let W be the 1-dimensional subspace spanned by v. Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V − dim W = n − 1. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator T̄ on V̄ whose minimal polynomial divides the minimal polynomial of T. Since the characteristic polynomial of T is a product of linear polynomials, so is its minimal polynomial; hence so are the minimal and characteristic polynomials of T̄. Thus V̄ and T̄ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {v̄₂, ..., v̄_n} of V̄ such that

T̄(v̄₂) = a₂₂v̄₂
T̄(v̄₃) = a₃₂v̄₂ + a₃₃v̄₃
···············
T̄(v̄_n) = a_n2 v̄₂ + ··· + a_nn v̄_n

Now let v₂, ..., v_n be elements of V which belong to the cosets v̄₂, ..., v̄_n respectively. Then {v, v₂, ..., v_n} is a basis of V (Problem 10.26). Since T̄(v̄₂) = a₂₂v̄₂, we have T̄(v̄₂) − a₂₂v̄₂ = 0̄ and so T(v₂) − a₂₂v₂ ∈ W. But W is spanned by v; hence T(v₂) − a₂₂v₂ is a multiple of v, say

T(v₂) − a₂₂v₂ = a₂₁v  and so  T(v₂) = a₂₁v + a₂₂v₂

Similarly, for i = 3, ..., n,

T(vᵢ) − aᵢ₂v₂ − aᵢ₃v₃ − ··· − aᵢᵢvᵢ ∈ W  and so  T(vᵢ) = aᵢ₁v + aᵢ₂v₂ + ··· + aᵢᵢvᵢ

Thus

T(v) = a₁₁v
T(v₂) = a₂₁v + a₂₂v₂
···············
T(v_n) = a_n1 v + a_n2 v₂ + ··· + a_nn v_n

and hence the matrix of T in this basis is triangular.

CYCLIC SUBSPACES, RATIONAL CANONICAL FORM

10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, T_v the restriction of T to Z(v, T), and m_v(t) = t^k + a_(k−1)t^(k−1) + ··· + a₀ the T-annihilator of v. Then:
(i) The set {v, T(v), ..., T^(k−1)(v)} is a basis of Z(v, T); hence dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix of T_v in the above basis is
which is a linear combination of those vectors which precede it in the sequence; hence the set B — {v, T{v), . . ., r''-i(i;)} is linearly independent. We now only have to show that Z(v, T) = L(B), the linear span of B. By the above, T^v) e L{B). We prove by induction that T^{v) &L(B) for every n. Suppose n>k and T^-^(v) E. L(B), i.e. !r"-i(v) is a linear com- bination of V, . ..,T^-i{v). Then r"(v) = r(r«-i(v)) is a linear combination of T{v), . . -tT^v). But THv) G L{B); hence T^{v) £ L(B) for every n. Consequently f(T)(v) G L(B) for any polynomial /(<). Thus Z{v, T) = L(B) and so B is a basis as claimed. (ii) Suppose m(t) = i* + 6j_i<«~i + • • • + &o is the minimal polynomial of r„. Then, since ■we (v, ), ^ ^ m(T^){v) = m(T){v) = T^(v) + h^^iT^--^(v) + • • ■ + h^v Thus T^{v) is a linear combination of v,T{v), . . ., T^-i{v), and therefore k ^ s. However, m„(r) = and so m^(T^) = 0. Then m(t) divides m„(t) and so s — k. Accordingly fc = s and hence w„(t) = tn{t). (iii) T„(v) = T(v) T„{T{v)) = THv) T„{T''-Hv)) = T^v) = -oov - a^T{v) - a^n(v) a^^^n'Hv) By definition, the matrix of T„ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence it is C, as required. 10.30. Let r : y -» y be linear. Let TF be a T-invariant subspace of V and T the induced operator on VIW. Prove: (i) The T-annihilator of v G V divides the minimal poly- nomial of T. (ii) The T-annihilator of v G VIW divides the minimal polynomial of T. (i) The r-annihilator of v GV is the minimal polynomial of the restriction of T to Z(v, T) and therefore, by Problem 10.6, it divides the minimal polynomial of T. (ii) The f-annihilator of ■p S VIW divides the minimal polynomial of f, which divides the minimal polynomial of T by Theorem 10.16. Remark. In case the minimal polynomial of T is /(t)" where f(t) is a monic irreducible poly- nomial, then the T-annihilator of v G V and the T-annihilator of i) G VIW are of the form /(t)™ where m — n. 10.31. 
10.31. Prove Lemma 10.13: Let T : V → V be a linear operator whose minimal polynomial is f(t)ⁿ where f(t) is a monic irreducible polynomial. Then V is the direct sum of T-cyclic subspaces Zᵢ = Z(vᵢ, T), i = 1, ..., r, with corresponding T-annihilators

f(t)^n₁, f(t)^n₂, ..., f(t)^n_r,  n = n₁ ≥ n₂ ≥ ··· ≥ n_r

Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of components and the same set of T-annihilators.

The proof is by induction on the dimension of V. If dim V = 1, then V is itself T-cyclic and the lemma holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than that of V.

Since the minimal polynomial of T is f(t)ⁿ, there exists v₁ ∈ V such that f(T)^(n−1)(v₁) ≠ 0; hence the T-annihilator of v₁ is f(t)ⁿ. Let Z₁ = Z(v₁, T) and recall that Z₁ is T-invariant. Let V̄ = V/Z₁ and let T̄ be the linear operator on V̄ induced by T. By Theorem 10.16, the minimal polynomial of T̄ divides f(t)ⁿ; hence the hypothesis holds for V̄ and T̄. Consequently, by induction, V̄ is the direct sum of T̄-cyclic subspaces; say,

V̄ = Z(v̄₂, T̄) ⊕ ··· ⊕ Z(v̄_r, T̄)

where the corresponding T̄-annihilators are f(t)^n₂, ..., f(t)^n_r, n ≥ n₂ ≥ ··· ≥ n_r.

We claim that there is a vector v₂ in the coset v̄₂ whose T-annihilator is f(t)^n₂, the T̄-annihilator of v̄₂. Let w be any vector in v̄₂. Then f(T)^n₂(w) ∈ Z₁. Hence there exists a polynomial g(t) for which

f(T)^n₂(w) = g(T)(v₁)   (1)

Since f(t)ⁿ is the minimal polynomial of T, we have by (1),

0 = f(T)ⁿ(w) = f(T)^(n−n₂) g(T)(v₁)

But f(t)ⁿ is the T-annihilator of v₁; hence f(t)ⁿ divides f(t)^(n−n₂) g(t) and so g(t) = f(t)^n₂ h(t) for some polynomial h(t). We set

v₂ = w − h(T)(v₁)

Since w − v₂ = h(T)(v₁) ∈ Z₁, v₂ also belongs to the coset v̄₂. Thus the T-annihilator of v₂ is a multiple of the T̄-annihilator of v̄₂. On the other hand, by (1),

f(T)^n₂(v₂) = f(T)^n₂(w − h(T)(v₁)) = f(T)^n₂(w) − g(T)(v₁) = 0

Consequently the T-annihilator of v₂ is f(t)^n₂, as claimed.
Similarly, there exist vectors v₃, ..., v_r ∈ V such that vᵢ ∈ v̄ᵢ and the T-annihilator of vᵢ is f(t)^nᵢ, the T̄-annihilator of v̄ᵢ. We set

Z₂ = Z(v₂, T), ..., Z_r = Z(v_r, T)

Let d denote the degree of f(t), so that f(t)^nᵢ has degree dnᵢ. Then since f(t)^nᵢ is both the T-annihilator of vᵢ and the T̄-annihilator of v̄ᵢ, we know that

{vᵢ, T(vᵢ), ..., T^(dnᵢ−1)(vᵢ)}  and  {v̄ᵢ, T̄(v̄ᵢ), ..., T̄^(dnᵢ−1)(v̄ᵢ)}

are bases for Z(vᵢ, T) and Z(v̄ᵢ, T̄) respectively, for i = 2, ..., r. But V̄ = Z(v̄₂, T̄) ⊕ ··· ⊕ Z(v̄_r, T̄); hence

{v̄₂, ..., T̄^(dn₂−1)(v̄₂), ..., v̄_r, ..., T̄^(dn_r−1)(v̄_r)}

is a basis for V̄. Therefore, by Problem 10.26 and the relation T̄ⁱ(v̄) = Tⁱ(v) + Z₁ (see Problem 10.27),

{v₁, ..., T^(dn₁−1)(v₁), v₂, ..., T^(dn₂−1)(v₂), ..., v_r, ..., T^(dn_r−1)(v_r)}

is a basis for V. Thus by Theorem 10.4, V = Z(v₁, T) ⊕ ··· ⊕ Z(v_r, T), as required.

It remains to show that the exponents n₁, ..., n_r are uniquely determined by T. Since d denotes the degree of f(t),

dim V = d(n₁ + ··· + n_r)  and  dim Zᵢ = dnᵢ,  i = 1, ..., r

Also, if s is any positive integer then (Problem 10.59) f(T)^s(Zᵢ) is a cyclic subspace generated by f(T)^s(vᵢ), and it has dimension d(nᵢ − s) if nᵢ > s and dimension 0 if nᵢ ≤ s.

Now any vector v ∈ V can be written uniquely in the form v = w₁ + ··· + w_r where wᵢ ∈ Zᵢ. Hence any vector in f(T)^s(V) can be written uniquely in the form

f(T)^s(v) = f(T)^s(w₁) + ··· + f(T)^s(w_r)

where f(T)^s(wᵢ) ∈ f(T)^s(Zᵢ). Let t be the integer, dependent on s, for which

n₁ > s, ..., n_t > s, n_(t+1) ≤ s

Then f(T)^s(V) = f(T)^s(Z₁) ⊕ ··· ⊕ f(T)^s(Z_t) and so

dim (f(T)^s(V)) = d[(n₁ − s) + ··· + (n_t − s)]   (*)

The numbers on the left of (*) are uniquely determined by T. Set s = n − 1 and (*) determines the number of nᵢ equal to n. Next set s = n − 2 and (*) determines the number of nᵢ (if any) equal to n − 1. We repeat the process until we set s = 0 and determine the number of nᵢ equal to 1. Thus the nᵢ are uniquely determined by T and V, and the lemma is proved.
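The uniqueness argument just given is easy to watch in action for f(t) = t (so d = 1), where the lemma specializes to nilpotent operators and the dimensions dim T^s(V) determine the exponents nᵢ, i.e. the sizes of the cyclic blocks. A small sketch under that assumption; the 5-square T below, with blocks of sizes 3, 1, 1, is a hypothetical example.

```python
import numpy as np

# Lemma 10.13 with f(t) = t: for a nilpotent T, the ranks of the powers of T
# (i.e. the dimensions of T^s(V)) determine the block sizes n_i.
def shift(n):
    return np.eye(n, k=1)               # n-square block with 1's on superdiagonal

T = np.zeros((5, 5))
T[:3, :3] = shift(3)                    # one block of order 3, two of order 1

rank = lambda s: np.linalg.matrix_rank(np.linalg.matrix_power(T, s))

# Number of blocks of size exactly j, read off from the dimensions of T^s(V):
blocks = {j: rank(j - 1) - 2 * rank(j) + rank(j + 1) for j in (1, 2, 3)}
assert blocks == {1: 2, 2: 0, 3: 1}
```

The second-difference formula used here is just the block-count identity of Problem 10.16 restated in terms of ranks.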
10.32. Let V be a vector space of dimension 7 over R, and let T : V → V be a linear operator with minimal polynomial m(t) = (t² + 2)(t + 3)³. Find all the possible rational canonical forms for T.

The sum of the degrees of the companion matrices must add up to 7. Also, one companion matrix must be C(t² + 2) and one must be C((t + 3)³). Thus the rational canonical form of T is exactly one of the following direct sums of companion matrices:

(i) C(t² + 2) ⊕ C(t² + 2) ⊕ C((t + 3)³)
(ii) C(t² + 2) ⊕ C((t + 3)³) ⊕ C((t + 3)²)
(iii) C(t² + 2) ⊕ C((t + 3)³) ⊕ C(t + 3) ⊕ C(t + 3)

Here, since (t + 3)³ = t³ + 9t² + 27t + 27 and (t + 3)² = t² + 6t + 9,

C(t² + 2) =
0 −2
1 0

C((t + 3)³) =
0 0 −27
1 0 −27
0 1 −9

C((t + 3)²) =
0 −9
1 −6

C(t + 3) = (−3)

PROJECTIONS

10.33. Suppose V = W₁ ⊕ ··· ⊕ W_r. The projection of V into its subspace W_k is the mapping E : V → V defined by E(v) = w_k, where v = w₁ + ··· + w_r, wᵢ ∈ Wᵢ. Show that (i) E is linear, (ii) E² = E.

(i) Since the sum v = w₁ + ··· + w_r, wᵢ ∈ Wᵢ, is uniquely determined by v, the mapping E is well defined. Suppose, for u ∈ V, u = w₁′ + ··· + w_r′, wᵢ′ ∈ Wᵢ. Then

v + u = (w₁ + w₁′) + ··· + (w_r + w_r′)  and  kv = kw₁ + ··· + kw_r,  kwᵢ, wᵢ + wᵢ′ ∈ Wᵢ

are the unique sums corresponding to v + u and kv. Hence

E(v + u) = w_k + w_k′ = E(v) + E(u)  and  E(kv) = kw_k = kE(v)

and therefore E is linear.

(ii) We have that

w_k = 0 + ··· + 0 + w_k + 0 + ··· + 0

is the unique sum corresponding to w_k ∈ W_k; hence E(w_k) = w_k. Then for any v ∈ V,

E²(v) = E(E(v)) = E(w_k) = w_k = E(v)

Thus E² = E, as required.

10.34. Suppose E : V → V is linear and E² = E. Show that: (i) E(u) = u for any u ∈ Im E, i.e. the restriction of E to its image is the identity mapping; (ii) V is the direct sum of the image and kernel of E: V = Im E ⊕ Ker E; (iii) E is the projection of V into Im E, its image. Thus, by the preceding problem, a linear mapping T : V → V is a projection if and only if T² = T; this characterization of a projection is frequently used as its definition.
(i) If u ∈ Im E, then there exists v ∈ V for which E(v) = u; hence

E(u) = E(E(v)) = E²(v) = E(v) = u

as required.

(ii) Let v ∈ V. We can write v in the form v = E(v) + (v − E(v)). Now E(v) ∈ Im E and, since

E(v − E(v)) = E(v) − E²(v) = E(v) − E(v) = 0

v − E(v) ∈ Ker E. Accordingly, V = Im E + Ker E.

Now suppose w ∈ Im E ∩ Ker E. By (i), E(w) = w because w ∈ Im E. On the other hand, E(w) = 0 because w ∈ Ker E. Thus w = 0 and so Im E ∩ Ker E = {0}. These two conditions imply that V is the direct sum of the image and kernel of E.

(iii) Let v ∈ V and suppose v = u + w where u ∈ Im E and w ∈ Ker E. Note that E(u) = u by (i), and E(w) = 0 because w ∈ Ker E. Hence

E(v) = E(u + w) = E(u) + E(w) = u + 0 = u

That is, E is the projection of V into its image.

10.35. Suppose V = U ⊕ W and suppose T : V → V is linear. Show that U and W are both T-invariant if and only if TE = ET, where E is the projection of V into U.

Observe that E(v) ∈ U for every v ∈ V, and that (i) E(v) = v iff v ∈ U, (ii) E(v) = 0 iff v ∈ W.

Suppose ET = TE. Let u ∈ U. Since E(u) = u,

T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) ∈ U

Hence U is T-invariant. Now let w ∈ W. Since E(w) = 0,

E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0  and so  T(w) ∈ W

Hence W is also T-invariant.

Conversely, suppose U and W are both T-invariant. Let v ∈ V and suppose v = u + w where u ∈ U and w ∈ W. Then T(u) ∈ U and T(w) ∈ W; hence E(T(u)) = T(u) and E(T(w)) = 0. Thus

(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)

and

(TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)

That is, (ET)(v) = (TE)(v) for every v ∈ V; therefore ET = TE, as required.

Supplementary Problems

INVARIANT SUBSPACES

10.36. Suppose W is invariant under T : V → V. Show that W is invariant under f(T) for any polynomial f(t).

10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators.

10.38. Suppose W is invariant under S : V → V and T : V → V. Show that W is also invariant under S + T and ST.

10.39.
Let T : V → V be linear and let W be the eigenspace belonging to an eigenvalue λ of T. Show that W is T-invariant.

10.40. Let V be a vector space of odd dimension (greater than 1) over the real field R. Show that any linear operator on V has an invariant subspace other than V or {0}.

10.41. Determine the invariant subspaces of

A =
2 −4
5 −2

viewed as a linear operator on (i) R², (ii) C².

10.42. Suppose dim V = n. Show that T : V → V has a triangular matrix representation if and only if there exist T-invariant subspaces W₁ ⊂ W₂ ⊂ ··· ⊂ W_n = V for which dim W_k = k, k = 1, ..., n.

INVARIANT DIRECT-SUMS

10.43. The subspaces W₁, ..., W_r are said to be independent if w₁ + ··· + w_r = 0, wᵢ ∈ Wᵢ, implies that each wᵢ = 0. Show that L(Wᵢ) = W₁ ⊕ ··· ⊕ W_r if and only if the Wᵢ are independent. (Here L(Wᵢ) denotes the linear span of the Wᵢ.)

10.44. Show that V = W₁ ⊕ ··· ⊕ W_r if and only if (i) V = L(Wᵢ) and (ii) W_k ∩ L(W₁, ..., W_(k−1), W_(k+1), ..., W_r) = {0}, k = 1, ..., r.

10.45. Show that L(Wᵢ) = W₁ ⊕ ··· ⊕ W_r if and only if dim L(Wᵢ) = dim W₁ + ··· + dim W_r.

10.46. Suppose the characteristic polynomial of T : V → V is Δ(t) = f₁(t)^n₁ f₂(t)^n₂ ··· f_r(t)^n_r, where the fᵢ(t) are distinct monic irreducible polynomials. Let V = W₁ ⊕ ··· ⊕ W_r be the primary decomposition of V into T-invariant subspaces. Show that fᵢ(t)^nᵢ is the characteristic polynomial of the restriction of T to Wᵢ.

NILPOTENT OPERATORS

10.47. Suppose S and T are nilpotent operators which commute, i.e. ST = TS. Show that S + T and ST are also nilpotent.

10.48. Suppose A is a supertriangular matrix, i.e. all entries on and below the main diagonal are 0. Show that A is nilpotent.

10.49. Let V be the vector space of polynomials of degree ≤ n. Show that the differential operator on V is nilpotent of index n + 1.

10.50. Show that the following nilpotent matrices of order n are similar: the matrix with 1's on the superdiagonal and 0's elsewhere, and the matrix with 1's on the subdiagonal and 0's elsewhere.

10.51.
Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.

JORDAN CANONICAL FORM

10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial Δ(t) and minimal polynomial m(t) are as follows:
(i) Δ(t) = (t − 2)⁴(t − 3)², m(t) = (t − 2)²(t − 3)²
(ii) Δ(t) = (t − 7)⁵, m(t) = (t − 7)²
(iii) Δ(t) = (t − 2)⁷, m(t) = (t − 2)³
(iv) Δ(t) = (t − 3)⁴(t − 5)⁴, m(t) = (t − 3)²(t − 5)²

10.53. Show that every complex matrix is similar to its transpose. (Hint. Use Jordan canonical form and Problem 10.50.)

10.54. Show that all complex matrices A of order n for which Aⁿ = I are similar.

10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with only real entries.

CYCLIC SUBSPACES

10.56. Suppose T : V → V is linear. Prove that Z(v, T) is the intersection of all T-invariant subspaces containing v.

10.57. Let f(t) and g(t) be the T-annihilators of u and v respectively. Show that if f(t) and g(t) are relatively prime, then f(t) g(t) is the T-annihilator of u + v.

10.58. Prove that Z(u, T) = Z(v, T) if and only if g(T)(u) = v where g(t) is relatively prime to the T-annihilator of u.

10.59. Let W = Z(v, T), and suppose the T-annihilator of v is f(t)ⁿ where f(t) is a monic irreducible polynomial of degree d. Show that f(T)^s(W) is a cyclic subspace generated by f(T)^s(v) and it has dimension d(n − s) if n > s and dimension 0 if n ≤ s.

RATIONAL CANONICAL FORM

10.60. Find all possible rational canonical forms for:
(i) 6 × 6 matrices with minimum polynomial m(t) = (t² + 3)(t + 1)²
(ii) 6 × 6 matrices with minimum polynomial m(t) = (t + 1)³
(iii) 8 × 8 matrices with minimum polynomial m(t) = (t² + 2)²(t + 3)²

10.61. Let A be a 4 × 4 matrix with minimum polynomial m(t) = (t² + 1)(t² − 3).
Find the rational canonical form for A if A is a matrix over (i) the rational field Q, (ii) the real field R, (iii) the complex field C.

10.62. Find the rational canonical form for the Jordan block

λ 1 0 0
0 λ 1 0
0 0 λ 1
0 0 0 λ

10.63. Prove that the characteristic polynomial of an operator T : V → V is a product of its elementary divisors.

10.64. Prove that two 3 × 3 matrices with the same minimal and characteristic polynomials are similar.

10.65. Let C(f(t)) denote the companion matrix to an arbitrary polynomial f(t). Show that f(t) is the characteristic polynomial of C(f(t)).

PROJECTIONS

10.66. Suppose V = W₁ ⊕ ··· ⊕ W_r. Let Eᵢ denote the projection of V into Wᵢ. Prove: (i) EᵢEⱼ = 0, i ≠ j; (ii) I = E₁ + ··· + E_r.

10.67. Let E₁, ..., E_r be linear operators on V such that: (i) Eᵢ² = Eᵢ, i.e. the Eᵢ are projections; (ii) EᵢEⱼ = 0, i ≠ j; (iii) I = E₁ + ··· + E_r. Prove that V = Im E₁ ⊕ ··· ⊕ Im E_r.

10.68. Suppose E : V → V is a projection, i.e. E² = E. Prove that E has a matrix representation of the form

I_r 0
0  0

where r is the rank of E and I_r is the r-square identity matrix.

10.69. Prove that any two projections of the same rank are similar. (Hint. Use the result of Problem 10.68.)

10.70. Suppose E : V → V is a projection. Prove: (i) I − E is a projection and V = Im E ⊕ Im (I − E); (ii) I + E is invertible (if 1 + 1 ≠ 0).

QUOTIENT SPACES

10.71. Let W be a subspace of V. Suppose the set of cosets {v₁ + W, v₂ + W, ..., v_n + W} in V/W is linearly independent. Show that the set of vectors {v₁, v₂, ..., v_n} in V is also linearly independent.

10.72. Let W be a subspace of V. Suppose the set of vectors {u₁, u₂, ..., u_n} in V is linearly independent, and that L(uᵢ) ∩ W = {0}. Show that the set of cosets {u₁ + W, ..., u_n + W} in V/W is also linearly independent.

10.73. Suppose V = U ⊕ W and that {u₁, ..., u_n} is a basis of U. Show that {u₁ + W, ..., u_n + W} is a basis of the quotient space V/W. (Observe that no condition is placed on the dimensionality of V or W.)

10.74.
Let W be the solution space of the linear equation
a1x1 + a2x2 + ··· + anxn = 0,  ai ∈ K
and let v = (b1, b2, ..., bn) ∈ K^n. Prove that the coset v + W of W in K^n is the solution set of the linear equation
a1x1 + a2x2 + ··· + anxn = b  where b = a1b1 + ··· + anbn

10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible by t^4, i.e. of the form a0 t^4 + a1 t^5 + ··· + a_{n−4} t^n. Show that the quotient space V/W is of dimension 4.

10.76. Let U and W be subspaces of V such that W ⊆ U ⊆ V. Note that any coset u + W of W in U may also be viewed as a coset of W in V since u ∈ U implies u ∈ V; hence U/W is a subset of V/W. Prove that (i) U/W is a subspace of V/W, (ii) dim (V/W) − dim (U/W) = dim (V/U).

10.77. Let U and W be subspaces of V. Show that the cosets of U ∩ W in V can be obtained by intersecting each of the cosets of U in V by each of the cosets of W in V:
V/(U ∩ W) = {(v + U) ∩ (v′ + W) : v, v′ ∈ V}

10.78. Let T : V → V be linear with kernel W and image U. Show that the quotient space V/W is isomorphic to U under the mapping θ : V/W → U defined by θ(v + W) = T(v). Furthermore, show that T = i ∘ θ ∘ η, where η : V → V/W is the natural mapping of V into V/W, i.e. η(v) = v + W, and i : U ⊆ V is the inclusion mapping, i.e. i(u) = u. (See diagram.)

Answers to Supplementary Problems

10.41. (i) R^2 and {0}; (ii) C^2, {0}, W1 = L((2, 1 − 2i)), W2 = L((2, 1 + 2i))

10.52. Writing Jk(λ) for the k × k Jordan block with eigenvalue λ and ⊕ for a diagonal block sum, the possible Jordan canonical forms are:
(i) J2(2) ⊕ J2(2) ⊕ J2(3);  J2(2) ⊕ J1(2) ⊕ J1(2) ⊕ J2(3)
(ii) J2(7) ⊕ J2(7) ⊕ J1(7);  J2(7) ⊕ J1(7) ⊕ J1(7) ⊕ J1(7)
(iii) J3(2) ⊕ J3(2) ⊕ J1(2);  J3(2) ⊕ J2(2) ⊕ J2(2);  J3(2) ⊕ J2(2) ⊕ J1(2) ⊕ J1(2);  J3(2) ⊕ J1(2) ⊕ J1(2) ⊕ J1(2) ⊕ J1(2)
(iv) J2(3) ⊕ J2(3) ⊕ J2(5) ⊕ J2(5);  J2(3) ⊕ J2(3) ⊕ J2(5) ⊕ J1(5) ⊕ J1(5);  J2(3) ⊕ J1(3) ⊕ J1(3) ⊕ J2(5) ⊕ J2(5);  J2(3) ⊕ J1(3) ⊕ J1(3) ⊕ J2(5) ⊕ J1(5) ⊕ J1(5)

10.60.
Writing C(f(t)) for the companion matrix of f(t) and ⊕ for a diagonal block sum, the possible rational canonical forms are:
(i) C(t^2 + 3) ⊕ C((t + 1)^2) ⊕ C((t + 1)^2);  C(t^2 + 3) ⊕ C((t + 1)^2) ⊕ C(t + 1) ⊕ C(t + 1);  C(t^2 + 3) ⊕ C(t^2 + 3) ⊕ C((t + 1)^2)
(ii) C((t + 1)^3) ⊕ C((t + 1)^3);  C((t + 1)^3) ⊕ C((t + 1)^2) ⊕ C(t + 1);  C((t + 1)^3) ⊕ C(t + 1) ⊕ C(t + 1) ⊕ C(t + 1)
(iii) C((t^2 + 2)^2) ⊕ C(t^2 + 2) ⊕ C((t + 3)^2);  C((t^2 + 2)^2) ⊕ C((t + 3)^2) ⊕ C((t + 3)^2);  C((t^2 + 2)^2) ⊕ C((t + 3)^2) ⊕ C(t + 3) ⊕ C(t + 3)
Here, for example,
C(t^2 + 3) = [ 0 −3 ; 1 0 ],  C((t + 1)^2) = [ 0 −1 ; 1 −2 ],  C(t + 1) = (−1),
C((t + 1)^3) = [ 0 0 −1 ; 1 0 −3 ; 0 1 −3 ],  C((t + 3)^2) = [ 0 −9 ; 1 −6 ]

10.61. (i) Over Q: C(t^2 + 1) ⊕ C(t^2 − 3) = [ 0 −1 ; 1 0 ] ⊕ [ 0 3 ; 1 0 ]
(ii) Over R: [ 0 −1 ; 1 0 ] ⊕ diag(√3, −√3)
(iii) Over C: diag(i, −i, √3, −√3)

10.62. The companion matrix of (t − λ)^4:
[ 0 0 0 −λ^4 ]
[ 1 0 0 4λ^3 ]
[ 0 1 0 −6λ^2 ]
[ 0 0 1 4λ ]

chapter 11

Linear Functionals and the Dual Space

INTRODUCTION

In this chapter we study linear mappings from a vector space V into its field K of scalars. (Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and results for arbitrary linear mappings on V hold for this special case. However, we treat these mappings separately because of their fundamental importance and because the special relationship of V to K gives rise to new notions and results which do not apply in the general case.

LINEAR FUNCTIONALS AND THE DUAL SPACE

Let V be a vector space over a field K. A mapping φ : V → K is termed a linear functional (or linear form) if, for every u, v ∈ V and every a, b ∈ K,
φ(au + bv) = a φ(u) + b φ(v)
In other words, a linear functional on V is a linear mapping from V into K.

Example 11.1: Let πi : K^n → K be the ith projection mapping, i.e. πi(a1, a2, ..., an) = ai. Then πi is linear and so it is a linear functional on K^n.

Example 11.2: Let V be the vector space of polynomials in t over R. Let J : V → R be the integral operator defined by J(p(t)) = ∫ from 0 to 1 of p(t) dt. Recall that J is linear; hence it is a linear functional on V.

Example 11.3: Let V be the vector space of n-square matrices over K. Let T : V → K be the trace mapping
T(A) = a11 + a22 + ··· + ann,  where A = (aij)
That is, T assigns to a matrix A the sum of its diagonal elements. This map is linear (Problem 11.27) and so it is a linear functional on V.
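The maps in Examples 11.1 and 11.3 can be checked numerically against the defining identity φ(au + bv) = a φ(u) + b φ(v). The following sketch is not from the text; the helper names are my own.

```python
# Numerical check that the i-th projection (Example 11.1) and the trace
# (Example 11.3) satisfy phi(a*u + b*v) = a*phi(u) + b*phi(v).

def proj(i):
    """The i-th projection pi_i on K^n (Example 11.1), 0-indexed here."""
    return lambda v: v[i]

def trace(A):
    """The trace functional on n-square matrices (Example 11.3)."""
    return sum(A[i][i] for i in range(len(A)))

def comb(u, v, a, b):
    """The vector a*u + b*v, componentwise."""
    return [a * x + b * y for x, y in zip(u, v)]

u, v = [1, 2, 3], [4, 5, 6]
a, b = 2, -3
pi2 = proj(1)
assert pi2(comb(u, v, a, b)) == a * pi2(u) + b * pi2(v)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
aA_bB = [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]
assert trace(aA_bB) == a * trace(A) + b * trace(B)
```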
By Theorem 6.6, the set of linear functionals on a vector space V over a field K is also a vector space over K, with addition and scalar multiplication defined by
(φ + σ)(v) = φ(v) + σ(v)  and  (kφ)(v) = k φ(v)
where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of V and is denoted by V*.

Example 11.4: Let V = K^n, the vector space of n-tuples which we write as column vectors. Then the dual space V* can be identified with the space of row vectors. In particular, any linear functional φ = (a1, ..., an) in V* has the representation
φ(x1, ..., xn) = a1x1 + a2x2 + ··· + anxn
Historically, the above formal expression was termed a linear form.

DUAL BASIS

Suppose V is a vector space of dimension n over K. By Theorem 6.7 the dimension of the dual space V* is also n (since K is of dimension 1 over itself). In fact, each basis of V determines a basis of V* as follows:

Theorem 11.1: Suppose {v1, ..., vn} is a basis of V over K. Let φ1, ..., φn ∈ V* be the linear functionals defined by
φi(vj) = δij = 1 if i = j, 0 if i ≠ j
Then {φ1, ..., φn} is a basis of V*.

The above basis {φi} is termed the basis dual to {vi} or the dual basis. The above formula, which uses the Kronecker delta δij, is a short way of writing
φ1(v1) = 1, φ1(v2) = 0, ..., φ1(vn) = 0
φ2(v1) = 0, φ2(v2) = 1, ..., φ2(vn) = 0
...................................
φn(v1) = 0, φn(v2) = 0, ..., φn(vn) = 1
By Theorem 6.2, these linear mappings φi are unique and well defined.

Example 11.5: Consider the following basis of R^2: {v1 = (2, 1), v2 = (3, 1)}. Find the dual basis {φ1, φ2}.
We seek linear functionals φ1(x, y) = ax + by and φ2(x, y) = cx + dy such that
φ1(v1) = 1, φ1(v2) = 0, φ2(v1) = 0, φ2(v2) = 1
Thus
φ1(v1) = φ1(2, 1) = 2a + b = 1 and φ1(v2) = φ1(3, 1) = 3a + b = 0, whence a = −1, b = 3
φ2(v1) = φ2(2, 1) = 2c + d = 0 and φ2(v2) = φ2(3, 1) = 3c + d = 1, whence c = 1, d = −2
Hence the dual basis is {φ1(x, y) = −x + 3y, φ2(x, y) = x − 2y}.

The next theorems give relationships between bases and their duals.

Theorem 11.2: Let {v1, ..., vn} be a basis of V and let {φ1, ..., φn} be the dual basis of V*. Then for any vector u ∈ V,
u = φ1(u) v1 + φ2(u) v2 + ··· + φn(u) vn
and, for any linear functional σ ∈ V*,
σ = σ(v1) φ1 + σ(v2) φ2 + ··· + σ(vn) φn

Theorem 11.3: Let {v1, ..., vn} and {w1, ..., wn} be bases of V and let {φ1, ..., φn} and {σ1, ..., σn} be the bases of V* dual to {vi} and {wi} respectively. Suppose P is the transition matrix from {vi} to {wi}. Then (P^−1)^t is the transition matrix from {φi} to {σi}.

SECOND DUAL SPACE

We repeat: every vector space V has a dual space V* which consists of all the linear functionals on V. Thus V* itself has a dual space V**, called the second dual of V, which consists of all the linear functionals on V*.

We now show that each v ∈ V determines a specific element v̂ ∈ V**. First of all, for any φ ∈ V* we define
v̂(φ) = φ(v)
It remains to be shown that this map v̂ : V* → K is linear. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ V*, we have
v̂(aφ + bσ) = (aφ + bσ)(v) = a φ(v) + b σ(v) = a v̂(φ) + b v̂(σ)
That is, v̂ is linear and so v̂ ∈ V**. The following theorem applies.

Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V onto V**.

The above mapping v ↦ v̂ is called the natural mapping of V into V**. We emphasize that this mapping is never onto V** if V is not finite-dimensional. However, it is always linear and, moreover, it is always one-to-one.

Now suppose V does have finite dimension. By the above theorem the natural mapping determines an isomorphism between V and V**.
Unless otherwise stated we shall identify V with V** by this mapping. Accordingly we shall view V as the space of linear functionals on V* and shall write V = V**. We remark that if {φi} is the basis of V* dual to a basis {vi} of V, then {vi} is the basis of V = V** which is dual to {φi}.

ANNIHILATORS

Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional φ ∈ V* is called an annihilator of W if φ(w) = 0 for every w ∈ W, i.e. if φ(W) = {0}. We show that the set of all such mappings, denoted by W^0 and called the annihilator of W, is a subspace of V*. Clearly 0 ∈ W^0. Now suppose φ, σ ∈ W^0. Then, for any scalars a, b ∈ K and for any w ∈ W,
(aφ + bσ)(w) = a φ(w) + b σ(w) = a·0 + b·0 = 0
Thus aφ + bσ ∈ W^0 and so W^0 is a subspace of V*.

In the case that W is a subspace of V, we have the following relationship between W and its annihilator W^0.

Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then (i) dim W + dim W^0 = dim V and (ii) W^00 = W.

Here W^00 = {v ∈ V : φ(v) = 0 for every φ ∈ W^0} or, equivalently, W^00 = (W^0)^0, where W^00 is viewed as a subspace of V under the identification of V and V**.

The concept of an annihilator enables us to give another interpretation of a homogeneous system of linear equations,
a11 x1 + a12 x2 + ··· + a1n xn = 0
...................................          (*)
am1 x1 + am2 x2 + ··· + amn xn = 0
Here each row (ai1, ai2, ..., ain) of the coefficient matrix A = (aij) is viewed as an element of K^n and each solution vector ξ = (x1, x2, ..., xn) is viewed as an element of the dual space. In this context, the solution space S of (*) is the annihilator of the rows of A and hence of the row space of A.
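This annihilator description can be checked on a concrete subspace. The sketch below is not from the text; it uses the subspace of Problem 11.10 (W ⊂ R^4 spanned by (1, 2, −3, 4) and (0, 1, 4, −1)) and the two annihilating functionals found there, written as coefficient 4-tuples.

```python
# A functional phi = (a, b, c, d) on R^4 annihilates W exactly when it
# annihilates each spanning vector of W, i.e. when the dot products vanish.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

w1 = (1, 2, -3, 4)       # spanning vectors of W
w2 = (0, 1, 4, -1)
phi1 = (11, -4, 1, 0)    # phi1(x, y, z, w) = 11x - 4y + z
phi2 = (6, -1, 0, -1)    # phi2(x, y, z, w) = 6x - y - w

# Each functional kills each spanning vector, hence all of W:
for phi in (phi1, phi2):
    assert dot(phi, w1) == 0 and dot(phi, w2) == 0

# Dimension count of Theorem 11.5: dim W + dim W^0 = 2 + 2 = 4 = dim R^4.
```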
Consequently, using Theorem 11.5, we again obtain the following fundamental result on the dimension of the solution space of a homogeneous system of linear equations:
dim S = dim K^n − dim (row space of A) = n − rank (A)

TRANSPOSE OF A LINEAR MAPPING

Let T : V → U be an arbitrary linear mapping from a vector space V into a vector space U. Now for any linear functional φ ∈ U*, the composition φ ∘ T is a linear mapping from V into K. That is, φ ∘ T ∈ V*. Thus the correspondence φ ↦ φ ∘ T is a mapping from U* into V*; we denote it by T^t and call it the transpose of T. In other words, T^t : U* → V* is defined by
T^t(φ) = φ ∘ T
Thus (T^t(φ))(v) = φ(T(v)) for every v ∈ V.

Theorem 11.6: The transpose mapping T^t defined above is linear.

Proof. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ U*,
T^t(aφ + bσ) = (aφ + bσ) ∘ T = a(φ ∘ T) + b(σ ∘ T) = a T^t(φ) + b T^t(σ)
That is, T^t is linear as claimed.

We emphasize that if T is a linear mapping from V into U, then T^t is a linear mapping from U* into V*. The name "transpose" for the mapping T^t no doubt derives from the following theorem.

Theorem 11.7: Let T : V → U be linear, and let A be the matrix representation of T relative to bases {vi} of V and {ui} of U. Then the transpose matrix A^t is the matrix representation of T^t : U* → V* relative to the bases dual to {ui} and {vi}.

Solved Problems

DUAL SPACES AND BASES

11.1. Let φ : R^2 → R and σ : R^2 → R be the linear functionals defined by φ(x, y) = x + 2y and σ(x, y) = 3x − y. Find (i) φ + σ, (ii) 4φ, (iii) 2φ − 5σ.
(i) (φ + σ)(x, y) = φ(x, y) + σ(x, y) = x + 2y + 3x − y = 4x + y
(ii) (4φ)(x, y) = 4 φ(x, y) = 4(x + 2y) = 4x + 8y
(iii) (2φ − 5σ)(x, y) = 2 φ(x, y) − 5 σ(x, y) = 2(x + 2y) − 5(3x − y) = −13x + 9y

11.2. Consider the following basis of R^3: {v1 = (1, −1, 3), v2 = (0, 1, −1), v3 = (0, 3, −2)}. Find the dual basis {φ1, φ2, φ3}.
We seek linear functionals
φ1(x, y, z) = a1x + a2y + a3z,  φ2(x, y, z) = b1x + b2y + b3z,  φ3(x, y, z) = c1x + c2y + c3z
such that
φ1(v1) = 1, φ1(v2) = 0, φ1(v3) = 0
φ2(v1) = 0, φ2(v2) = 1, φ2(v3) = 0
φ3(v1) = 0, φ3(v2) = 0, φ3(v3) = 1

We find φ1 as follows:
φ1(v1) = φ1(1, −1, 3) = a1 − a2 + 3a3 = 1
φ1(v2) = φ1(0, 1, −1) = a2 − a3 = 0
φ1(v3) = φ1(0, 3, −2) = 3a2 − 2a3 = 0
Solving the system of equations, we obtain a1 = 1, a2 = 0, a3 = 0. Thus φ1(x, y, z) = x.

We next find φ2:
φ2(v1) = φ2(1, −1, 3) = b1 − b2 + 3b3 = 0
φ2(v2) = φ2(0, 1, −1) = b2 − b3 = 1
φ2(v3) = φ2(0, 3, −2) = 3b2 − 2b3 = 0
Solving the system, we obtain b1 = 7, b2 = −2, b3 = −3. Hence φ2(x, y, z) = 7x − 2y − 3z.

Finally, we find φ3:
φ3(v1) = φ3(1, −1, 3) = c1 − c2 + 3c3 = 0
φ3(v2) = φ3(0, 1, −1) = c2 − c3 = 0
φ3(v3) = φ3(0, 3, −2) = 3c2 − 2c3 = 1
Solving the system, we obtain c1 = −2, c2 = 1, c3 = 1. Thus φ3(x, y, z) = −2x + y + z.

11.3. Let V be the vector space of polynomials over R of degree ≤ 1, i.e. V = {a + bt : a, b ∈ R}. Let φ1 : V → R and φ2 : V → R be defined by
φ1(f(t)) = ∫ from 0 to 1 of f(t) dt  and  φ2(f(t)) = ∫ from 0 to 2 of f(t) dt
(We remark that φ1 and φ2 are linear and so belong to the dual space V*.) Find the basis {v1, v2} of V which is dual to {φ1, φ2}.
Let v1 = a + bt and v2 = c + dt. By definition of the dual basis,
φ1(v1) = 1, φ2(v1) = 0  and  φ1(v2) = 0, φ2(v2) = 1
Thus
φ1(v1) = ∫ from 0 to 1 of (a + bt) dt = a + b/2 = 1
φ2(v1) = ∫ from 0 to 2 of (a + bt) dt = 2a + 2b = 0
or a = 2, b = −2
Then a(Vi) = (fci0i + • • • + fe„0„)(lJi) = fei 01(1^1) + k2<p2{Vi) + ••• + fe„0„(Vl) = fci • 1 + ^2 • + • • • + &„ • = fci Similarly, for i = 2, . . . , w, <T(Di) = (fci0i + • ■ ■ + /i:„0„)(i;i) = A;i0i('!;i) + ••• + ki^iivi) + •■• + fe„0n(-i'i) = ^i Thus <p{Vi) = a{Vi) for i = 1, . ..,n. Since and a agree on the basis vectors, = cr = &101 + • • • + fe„0„. Accordingly, {0i 0„} spans V*. It remains to be shown that {0i, . . . , J is linearly independent. Suppose ai0i + a202 + • • • + a„0„ = Applying both sides to v^, we obtain = 0(vj) = {aj0i + • ■ • + a„0n)(i'i) = ai0i(l'i) + ttz 02(^1) + ■•• + O'nSi'nC'yi) = »! • 1 + a2 • + ■ • • + a„ • = tti Similarly, for i = 2, . . . , «, = ©(-Ui) = (ai0i ^ + a„^„)(Vi) = a-l 0l('yt) + • • • + »i S6i(^i) + • • • + «n S^nC^'i) = "i That is, «! = 0, . . .,a.„ = 0. Hence {0i, ...,</>n} is linearly independent and so it is a basis of V*. 11.5. Prove Theorem 11.2: Let {vi, . . . , v„} be a basis of V and let {<j>^, .. .,<f,Jhe the dual basis of V*. Then, for any vector uGV, and, for any linear functional a GV*, a = cr(i;>j + .7(1;,)^, + • • • + <7(i;J<^„ (2) Suppose M — aiVi + a^Vi + • • • + «„-«„ (5) Then 0i(m) = di 0i(i;i) + ^2 01(^2) + • ■ • + «« SilC'^n) = «! • 1 + O2 • + • • • + «■« • == «1 CHAP. 11] LINEAR FUNCTIONALS AND THE DUAL SPACE 255 Similarly, for i = 2, . . .,n, </>i{u) = »! ,j,i(vi) + •■■ + Ui <f>i{Vi) + •■■ + a„ 0i(i;„) = »; That is, ipiiu) = Oj, 02(w) = a2' • • •. 0nW = «n- Substituting these results into (3), we obtain (1). Next we prove (2). Applying the linear functional a to both sides of (1), a{u) = 0i(M)<r(vi) + ^aM^K) + ••• + </>n(u) <r(v„) = o{Vi) ^i(w) + o(V2) 02(m) + • • • + a{vj <f,„{u) = {<'{Vi)'f>l + ('(^2)02 + • • • + <r(t)„)0„)(M) Since the above holds for every m G V, a — a{vi)<f,i + <r(i^2)02 + • • • + "{vj^^ as claimed. 11.6. Prove Theorem 11.3: Let {vu...,Vn} and {wi,...,Wn} be bases of V and let {^1, • . . , ^„} and (<7j, . . . , CT„} be the bases of V* dual to {vi} and {Wt} respectively. 
Suppose P is the transition matrix from {Vi} to {Wi}. Then (P~»)' is the transition matrix from {(j>J to {<tJ. Suppose Wi = OuUi + ai2V2 + • • ■ + ai„i;„ <ri = ftn^i + 6i202 + • ■ ■ + 6i„0„ W2 = (121^1 + a22'«'2 + ■ ■ • + a2nVn 02 = 62101 + ^2202 + " " " + hn'f'n w„ = a„ii)i + a„2i'2 + • ■ • + a„„v„ a„ - b^i,pi + 6„202 + ' • • + 6„n0„ where P = («„) and Q = (6y). We seek to prove that Q = (P-i)«. Let Ri denote the tth row of Q and let Cj denote the ith column of P«. Then ■Rj = (6ti. 6i2. • • • . 6i„) and Cj = (dji, aj2, . . . , a^^Y By definition of the dual basis, <fi(Wj) = (6(101 + 6i202 + • • • + 6j„^„)(ajii)i + aj2V2 + ■ • • + aj„v„) = 6jiaji + 6j2aj2 + ■ • • + 6i„a.j„ = Rfij = Sjj where Sy is the Kronecker delta. Thus /KiCi iJ,C2 ... K„C„\ QPt = ^2^1 R2C2 • • ■ R2C„ - !"■'■ •■■"1=7 \RnCl Rvp2 ■ ■ ■ ^rflnl and hence Q = (P«)-i = (P-i)« as claimed. 11.7. Suppose V has finite dimension. Show that if v GV, v¥'0, then there exists <j>GV* such that <^(v) # 0. We extend {v} to a basis {i), ^2. • • • > ^n) of V- By Theorem 6.1, there exists a unique linear mapping <f>:V ^ K such that 0(1;) - 1 and ^(i;^) = 0, i = 2, . . . , n. Hence ^ has the desired property. 11.8. Prove Theorem 11.4: If V has finite dimension, then the mapping v ^ v is an isomorphism of V onto V**. (Here v : V* -* K is defined by v{<j>) = ^(v).) We first prove that the map v \-^ v is linear, i.e. for any vectors v,w eV and any scalars a,b & K, av + bw = av + bw. For any linear functional <fi G V*, av + bw (0) = <f,{av + bw) = a ^{v) + b (f>{w) = av(</,) + bw(4,) = (av + bw)(ip) Since av + bw (<i>) = (at) + 6M))(0) for every <p&V*, we have SaT+Vw = 0?+ 6w. Thus the At. map v 1-^ 1; is linear. 256 LINEAR FUNCTIONALS AND THE DUAL SPACE [CHAP. 11 Now suppose V GV, v ¥= 0. Then, by the preceding problem, there exists <f> G V* for which <f,{v) # 0. Hence v (<j,) - <t,(v) # and thus v ¥^ Q. Since v # implies v¥=Q, the map v H- r is nonsingular and hence an isomorphism (Theorem 6.5). 
Now dim V = dim V* = dim V** because V has finite dimension. Accordingly, mapping v ^ v is an isomorphism of V onto V**. ANNIHILATORS 11.9. Show that if <j>GV* annihilates a subset S of V, then ^ annihilates the linear span L{S) of S. Hence S» = (L(S))». Suppose V e L{S). Then there exist Wj, . . .yW^G S for which v = a^w^ + a^w^ + • • • + a^w^. <f>{v) = Ui 0(Wi) + (12 0(W2) + ■ ■ ■ + «r ('('^r) = "l" + ^^20 + • • ' + afi = Since v was an arbitrary element of L(S), <f> annihilates L(S) as claimed. 11.10. Let W be the subspace of R* spanned by vi = (1, 2, -3, 4) and v^ = (0, 1, 4, -1). Find a basis of the annihilator of W. By the preceding problem, it suffices to find a basis of the set of linear functionals <f>{x, y, z, w) = ax + hy + CZ + dw for which 4>{vi) = and <p(v2) — 0: 0(1,2,-3,4) = a + 2& - 3c + 4d = 0(0,1,4,-1) = 6 + 4c-d = The system of equations in unknowns a, h, c, d is in echelon form with free variables c and d. Set c = 1, d = to obtain the solution o = 11, 6 = -4, c = 1, d = and hence the linear func- tional 0i(a;, y, z, w) = 11a; — 4j/ + z. Set c = 0, d = -1 to obtain the solution a = 6, 6 = -1, c = 0, d = -1 and hence the linear func- tional 02(«^. 2/. «. *") = &x ~ y — w. The set of linear functionals {0i, 02} is a basis of W>, the annihilator of W. 11.11. Show that: (i) for any subset S of V, S^S""; (ii) if S1CS2, then S^cS?. (i) Let V e S. Then for every linear functional e S*, v(0) = 0(v) = 0. Hence v G (SO)*. Therefore, under the identification of F and V**, v S S»o. Accordingly, S C S"*. (ii) Let 0GS2. Then (■!;) = for every ■« G Sg. But S1CS2; hence annihilates every ele- ment of Si, i.e. G Si. Therefore S^ cSj. 11.12. Prove Theorem 11.5: Suppose V has finite dimension and W is & subspace of V. Then (i) dim W + dim W = dim V and (ii) W^ = W. (i) Suppose dim V = w and dim W = r - n. We want to show that dim W'> = n-r. We choose a basis {wi, ...,w^}oiW and extend it to the following basis of V: {wi, . . .,w„ Vi, . . 
..., v_{n−r}}. Consider the dual basis
{φ1, ..., φr, σ1, ..., σ_{n−r}}
By definition of the dual basis, each of the above σ's annihilates each wi; hence σ1, ..., σ_{n−r} ∈ W^0. We claim that {σi} is a basis of W^0. Now {σi} is part of a basis of V* and so it is linearly independent. We next show that {σi} spans W^0. Let σ ∈ W^0. By Theorem 11.2,
σ = σ(w1)φ1 + ··· + σ(wr)φr + σ(v1)σ1 + ··· + σ(v_{n−r})σ_{n−r}
  = 0·φ1 + ··· + 0·φr + σ(v1)σ1 + ··· + σ(v_{n−r})σ_{n−r}
  = σ(v1)σ1 + ··· + σ(v_{n−r})σ_{n−r}
Thus {σ1, ..., σ_{n−r}} spans W^0 and so it is a basis of W^0. Accordingly, dim W^0 = n − r = dim V − dim W as required.
(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W^0 = n − r. Thus, again by (i), dim W^00 = n − (n − r) = r; therefore dim W = dim W^00. By the preceding problem, W ⊆ W^00. Accordingly, W = W^00.

11.13. Let U and W be subspaces of V. Prove: (U + W)^0 = U^0 ∩ W^0.
Let φ ∈ (U + W)^0. Then φ annihilates U + W and so, in particular, φ annihilates U and W. That is, φ ∈ U^0 and φ ∈ W^0; hence φ ∈ U^0 ∩ W^0. Thus (U + W)^0 ⊆ U^0 ∩ W^0.
On the other hand, suppose σ ∈ U^0 ∩ W^0. Then σ annihilates U and also W. If v ∈ U + W, then v = u + w where u ∈ U and w ∈ W. Hence σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus σ annihilates U + W, i.e. σ ∈ (U + W)^0. Accordingly, U^0 ∩ W^0 ⊆ (U + W)^0.
Both inclusion relations give us the desired equality.
Remark: Observe that no dimension argument is employed in the proof; hence the result holds for spaces of finite or infinite dimension.

TRANSPOSE OF A LINEAR MAPPING

11.14. Let φ be the linear functional on R^2 defined by φ(x, y) = x − 2y. For each of the following linear operators T on R^2, find (T^t(φ))(x, y): (i) T(x, y) = (x, 0); (ii) T(x, y) = (y, x + y); (iii) T(x, y) = (2x − 3y, 5x + 2y).
By definition of the transpose mapping, T^t(φ) = φ ∘ T, i.e. (T^t(φ))(v) = φ(T(v)) for every vector v.
Hence
(i) (T^t(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x
(ii) (T^t(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y − 2(x + y) = −2x − y
(iii) (T^t(φ))(x, y) = φ(T(x, y)) = φ(2x − 3y, 5x + 2y) = (2x − 3y) − 2(5x + 2y) = −8x − 7y

11.15. Let T : V → U be linear and let T^t : U* → V* be its transpose. Show that the kernel of T^t is the annihilator of the image of T, i.e. Ker T^t = (Im T)^0.
Suppose φ ∈ Ker T^t; that is, T^t(φ) = φ ∘ T = 0. If u ∈ Im T, then u = T(v) for some v ∈ V; hence
φ(u) = φ(T(v)) = (φ ∘ T)(v) = 0(v) = 0
We have that φ(u) = 0 for every u ∈ Im T; hence φ ∈ (Im T)^0. Thus Ker T^t ⊆ (Im T)^0.
On the other hand, suppose σ ∈ (Im T)^0; that is, σ(Im T) = {0}. Then, for every v ∈ V,
(T^t(σ))(v) = (σ ∘ T)(v) = σ(T(v)) = 0 = 0(v)
We have that (T^t(σ))(v) = 0(v) for every v ∈ V; hence T^t(σ) = 0. Therefore σ ∈ Ker T^t and so (Im T)^0 ⊆ Ker T^t.
Both inclusion relations give us the required equality.

11.16. Suppose V and U have finite dimension and suppose T : V → U is linear. Prove: rank (T) = rank (T^t).
Suppose dim V = n and dim U = m. Also suppose rank (T) = r. Then, by Theorem 11.5,
dim ((Im T)^0) = dim U − dim (Im T) = m − rank (T) = m − r
By the preceding problem, Ker T^t = (Im T)^0. Hence nullity (T^t) = m − r. It then follows that, as claimed,
rank (T^t) = dim U* − nullity (T^t) = m − (m − r) = r = rank (T)

11.17. Prove Theorem 11.7: Let T : V → U be linear and let A be the matrix representation of T relative to bases {v1, ..., vm} of V and {u1, ..., un} of U. Then the transpose matrix A^t is the matrix representation of T^t : U* → V* relative to the bases dual to {ui} and {vi}.
Suppose
T(v1) = a11u1 + a12u2 + ··· + a1nun
T(v2) = a21u1 + a22u2 + ··· + a2nun
..................................   (1)
T(vm) = am1u1 + am2u2 + ··· + amnun
We want to prove that
T^t(σ1) = a11φ1 + a21φ2 + ··· + am1φm
T^t(σ2) = a12φ1 + a22φ2 + ··· + am2φm
..................................   (2)
T^t(σn) = a1nφ1 + a2nφ2 + ··· + amnφm
where {σi} and {φj} are the bases dual to {ui} and {vj} respectively.
Let v ∈ V and suppose v = k1v1 + k2v2 + ··· + kmvm.
Then, by (1),
T(v) = k1 T(v1) + k2 T(v2) + ··· + km T(vm)
     = k1(a11u1 + ··· + a1nun) + k2(a21u1 + ··· + a2nun) + ··· + km(am1u1 + ··· + amnun)
     = (k1a11 + k2a21 + ··· + kmam1)u1 + ··· + (k1a1n + k2a2n + ··· + kmamn)un
     = Σ from i = 1 to n of (k1a1i + k2a2i + ··· + kmami) ui
Hence for j = 1, ..., n,
(T^t(σj))(v) = σj(T(v)) = σj( Σ from i = 1 to n of (k1a1i + k2a2i + ··· + kmami) ui ) = k1a1j + k2a2j + ··· + kmamj   (3)
On the other hand, for j = 1, ..., n,
(a1jφ1 + a2jφ2 + ··· + amjφm)(v) = (a1jφ1 + a2jφ2 + ··· + amjφm)(k1v1 + k2v2 + ··· + kmvm) = k1a1j + k2a2j + ··· + kmamj   (4)
Since v ∈ V was arbitrary, (3) and (4) imply that
T^t(σj) = a1jφ1 + a2jφ2 + ··· + amjφm,  j = 1, ..., n
which is (2). Thus the theorem is proved.

11.18. Let A be an arbitrary m × n matrix over a field K. Prove that the row rank and the column rank of A are equal.
Let T : K^n → K^m be the linear map defined by T(v) = Av, where the elements of K^n and K^m are written as column vectors. Then A is the matrix representation of T relative to the usual bases of K^n and K^m, and the image of T is the column space of A. Hence
rank (T) = column rank of A
By Theorem 11.7, A^t is the matrix representation of T^t relative to the dual bases. Hence
rank (T^t) = column rank of A^t = row rank of A
But by Problem 11.16, rank (T) = rank (T^t); hence the row rank and the column rank of A are equal. (This result was stated earlier as Theorem 5.9, page 90, and was proved in a direct way in Problem 5.21.)

Supplementary Problems

DUAL SPACES AND DUAL BASES

11.19. Let φ : R^3 → R and σ : R^3 → R be the linear functionals defined by φ(x, y, z) = 2x − 3y + z and σ(x, y, z) = 4x − 2y + 3z. Find (i) φ + σ, (ii) 3φ, (iii) 2φ − 5σ.

11.20. Let φ be the linear functional on R^2 defined by φ(2, 1) = 15 and φ(1, −2) = −10. Find φ(x, y) and, in particular, find φ(−2, 7).

11.21. Find the dual basis of each of the following bases of R^3: (i) {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, (ii) {(1, −2, 3), (1, −1, 1), (2, −4, 7)}.
11.22. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ1, φ2 and φ3 be the linear functionals on V defined by
φ1(f(t)) = ∫ from 0 to 1 of f(t) dt,  φ2(f(t)) = f′(1),  φ3(f(t)) = f(0)
Here f(t) = a + bt + ct^2 ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f1(t), f2(t), f3(t)} of V which is dual to {φ1, φ2, φ3}.

11.23. Suppose u, v ∈ V and that φ(u) = 0 implies φ(v) = 0 for all φ ∈ V*. Show that v = ku for some scalar k.

11.24. Suppose φ, σ ∈ V* and that φ(v) = 0 implies σ(v) = 0 for all v ∈ V. Show that σ = kφ for some scalar k.

11.25. Let V be the vector space of polynomials over K. For a ∈ K, define φa : V → K by φa(f(t)) = f(a). Show that: (i) φa is linear; (ii) if a ≠ b, then φa ≠ φb.

11.26. Let V be the vector space of polynomials of degree ≤ 2. Let a, b, c ∈ K be distinct scalars. Let φa, φb and φc be the linear functionals defined by φa(f(t)) = f(a), φb(f(t)) = f(b), φc(f(t)) = f(c). Show that {φa, φb, φc} is linearly independent, and find the basis {f1(t), f2(t), f3(t)} of V which is its dual.

11.27. Let V be the vector space of square matrices of order n. Let T : V → K be the trace mapping: T(A) = a11 + a22 + ··· + ann, where A = (aij). Show that T is linear.

11.28. Let W be a subspace of V. For any linear functional φ on W, show that there is a linear functional σ on V such that σ(w) = φ(w) for any w ∈ W, i.e. φ is the restriction of σ to W.

11.29. Let {e1, ..., en} be the usual basis of K^n. Show that the dual basis is {π1, ..., πn} where πi is the ith projection mapping: πi(a1, ..., an) = ai.

11.30. Let V be a vector space over R. Let φ1, φ2 ∈ V* and suppose σ : V → R defined by σ(v) = φ1(v) φ2(v) also belongs to V*. Show that either φ1 = 0 or φ2 = 0.

ANNIHILATORS

11.31. Let W be the subspace of R^4 spanned by (1, 2, −3, 4), (1, 3, −2, 6) and (1, 4, −1, 8). Find a basis of the annihilator of W.

11.32. Let W be the subspace of R^3 spanned by (1, 1, 0) and (0, 1, 1). Find a basis of the annihilator of W.

11.33.
Show that, for any subset S of V, L(S) = S^00, where L(S) is the linear span of S.

11.34. Let U and W be subspaces of a vector space V of finite dimension. Prove: (U ∩ W)^0 = U^0 + W^0.

11.35. Suppose V = U ⊕ W. Prove that V* = U^0 ⊕ W^0.

TRANSPOSE OF A LINEAR MAPPING

11.36. Let φ be the linear functional on R^2 defined by φ(x, y) = 3x − 2y. For each linear mapping T : R^3 → R^2, find (T^t(φ))(x, y, z): (i) T(x, y, z) = (x + y, y + z); (ii) T(x, y, z) = (x + y + z, 2x − y).

11.37. Suppose S : U → V and T : V → W are linear. Prove that (T ∘ S)^t = S^t ∘ T^t.

11.38. Suppose T : V → U is linear and V has finite dimension. Prove that Im T^t = (Ker T)^0.

11.39. Suppose T : V → U is linear and u ∈ U. Prove that u ∈ Im T or there exists φ ∈ U* such that T^t(φ) = 0 and φ(u) = 1.

11.40. Let V be of finite dimension. Show that the mapping T ↦ T^t is an isomorphism from Hom (V, V) onto Hom (V*, V*). (Here T is any linear operator on V.)

MISCELLANEOUS PROBLEMS

11.41. Let V be a vector space over R. The line segment uv joining points u, v ∈ V is defined by uv = {tu + (1 − t)v : 0 ≤ t ≤ 1}. A subset S of V is termed convex if u, v ∈ S implies uv ⊆ S. Let φ ∈ V* and let
W+ = {v ∈ V : φ(v) > 0},  W = {v ∈ V : φ(v) = 0},  W− = {v ∈ V : φ(v) < 0}
Prove that W+, W and W− are convex.

11.42. Let V be a vector space of finite dimension. A hyperplane H of V is defined to be the kernel of a nonzero linear functional φ on V. Show that every subspace of V is the intersection of a finite number of hyperplanes.

Answers to Supplementary Problems

11.19. (i) 6x − 5y + 4z, (ii) 6x − 9y + 3z, (iii) −16x + 4y − 13z

11.20. φ(x, y) = 4x + 7y, φ(−2, 7) = 41

11.21. (i) {φ1(x, y, z) = x, φ2(x, y, z) = y, φ3(x, y, z) = z}
(ii) {φ1(x, y, z) = −3x − 5y − 2z, φ2(x, y, z) = 2x + y, φ3(x, y, z) = x + 2y + z}

11.25. (ii) Let f(t) = t. Then φa(f(t)) = a ≠ b = φb(f(t)), and therefore φa ≠ φb.

11.26. f1(t) = (t − b)(t − c) / ((a − b)(a − c)),  f2(t) = (t − a)(t − c) / ((b − a)(b − c)),  f3(t) = (t − a)(t − b) / ((c − a)(c − b))

11.31. {φ1(x, y, z, t) = 5x − y + z, φ2(x, y, z, t) = 2y − t}

11.32. {φ(x, y, z) = x − y + z}

11.36. (i) (T^t(φ))(x, y, z) = 3x + y − 2z, (ii) (T^t(φ))(x, y, z) = −x + 5y + 3z

chapter 12

Bilinear, Quadratic and Hermitian Forms

BILINEAR FORMS

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping f : V × V → K which satisfies
(i) f(au1 + bu2, v) = a f(u1, v) + b f(u2, v)
(ii) f(u, av1 + bv2) = a f(u, v1) + b f(u, v2)
for all a, b ∈ K and all ui, vi ∈ V. We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear in the second variable.

Example 12.1: Let φ and σ be arbitrary linear functionals on V. Let f : V × V → K be defined by f(u, v) = φ(u) σ(v). Then f is bilinear because φ and σ are each linear. (Such a bilinear form f turns out to be the "tensor product" of φ and σ and so is sometimes written f = φ ⊗ σ.)

Example 12.2: Let f be the dot product on R^n; that is, f(u, v) = u · v = a1b1 + a2b2 + ··· + anbn where u = (ai) and v = (bi). Then f is a bilinear form on R^n.

Example 12.3: Let A = (aij) be any n × n matrix over K. Then A may be viewed as a bilinear form f on K^n by defining
f(X, Y) = X^t A Y = Σ over i, j of aij xi yj = a11x1y1 + a12x1y2 + ··· + annxnyn
The above formal expression in the variables xi, yi is termed the bilinear polynomial corresponding to the matrix A. Formula (1) below shows that, in a certain sense, every bilinear form is of this type.

We will let B(V) denote the set of bilinear forms on V. A vector space structure is placed on B(V) by defining f + g and kf by:
(f + g)(u, v) = f(u, v) + g(u, v)
(kf)(u, v) = k f(u, v)
for any f, g ∈ B(V) and any k ∈ K. In fact,

Theorem 12.1: Let V be a vector space of dimension n over K. Let {φ1, ..., φn} be a basis of the dual space V*. Then {fij : i, j = 1, ..., n} is a basis of B(V), where fij is defined by fij(u, v) = φi(u) φj(v). Thus, in particular, dim B(V) = n^2.
BILINEAR FORMS AND MATRICES

Let f be a bilinear form on V, and let {e1, ..., en} be a basis of V. Suppose u, v ∈ V and suppose
u = a1e1 + ··· + anen,  v = b1e1 + ··· + bnen
Then
f(u, v) = f(a1e1 + ··· + anen, b1e1 + ··· + bnen) = a1b1 f(e1, e1) + a1b2 f(e1, e2) + ··· + anbn f(en, en) = Σ from i, j = 1 to n of aibj f(ei, ej)
Thus f is completely determined by the n^2 values f(ei, ej).

The matrix A = (aij) where aij = f(ei, ej) is called the matrix representation of f relative to the basis {ei} or, simply, the matrix of f in {ei}. It "represents" f in the sense that
f(u, v) = Σ aibj f(ei, ej) = (a1, ..., an) A (b1, ..., bn)^t = [u]e^t A [v]e   (1)
for all u, v ∈ V. (As usual, [u]e denotes the coordinate (column) vector of u ∈ V in the basis {ei}.)

We next ask, how does a matrix representing a bilinear form transform when a new basis is selected? The answer is given in the following theorem. (Recall Theorem 7.4: the transition matrix P from one basis {ei} to another {ei′} has the property that [u]e = P[u]e′ for every u ∈ V.)

Theorem 12.2: Let P be the transition matrix from one basis to another. If A is the matrix of f in the original basis, then B = P^t A P is the matrix of f in the new basis.

The above theorem motivates the following definition.

Definition: A matrix B is said to be congruent to a matrix A if there exists an invertible (or: nonsingular) matrix P such that B = P^t A P.

Thus by the above theorem matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank because P and P^t are nonsingular; hence the following definition is well defined.

Definition: The rank of a bilinear form f on V, written rank (f), is defined to be the rank of any matrix representation. We say that f is degenerate or nondegenerate according as rank (f) < dim V or rank (f) = dim V.

ALTERNATING BILINEAR FORMS

A bilinear form f on V is said to be alternating if
(i) f(v, v) = 0 for every v ∈ V.
If f is alternating, then
    0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v)
and so
    (ii) f(u, v) = -f(v, u)
for every u, v ∈ V. A bilinear form which satisfies condition (ii) is said to be skew symmetric (or: anti-symmetric). If 1 + 1 ≠ 0 in K, then condition (ii) implies f(v, v) = -f(v, v), which implies condition (i). In other words, alternating and skew symmetric are equivalent when 1 + 1 ≠ 0.

The main structure theorem on alternating bilinear forms follows.

Theorem 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix whose diagonal consists of blocks
    (  0  1 )
    ( -1  0 )
followed by zeros. Moreover, the number of these 2 x 2 blocks is uniquely determined by f (because it is equal to (1/2) rank(f)).

In particular, the above theorem shows that an alternating bilinear form must have even rank.

SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS

A bilinear form f on V is said to be symmetric if f(u, v) = f(v, u) for every u, v ∈ V. If A is a matrix representation of f, we can write
    f(X, Y) = X^t A Y = (X^t A Y)^t = Y^t A^t X
(We use the fact that X^t A Y is a scalar and therefore equals its transpose.) Thus if f is symmetric,
    Y^t A^t X = f(X, Y) = f(Y, X) = Y^t A X
and since this is true for all vectors X, Y it follows that A = A^t, i.e. A is symmetric. Conversely, if A is symmetric then f is symmetric.

The main result for symmetric bilinear forms is given in

Theorem 12.4: Let f be a symmetric bilinear form on V over K (in which 1 + 1 ≠ 0). Then V has a basis {v_1, ..., v_n} in which f is represented by a diagonal matrix, i.e. f(v_i, v_j) = 0 for i ≠ j.

Alternate Form of Theorem 12.4: Let A be a symmetric matrix over K (in which 1 + 1 ≠ 0). Then there exists an invertible (or: nonsingular) matrix P such that P^t A P is diagonal. That is, A is congruent to a diagonal matrix.

264 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP.
12 Since an invertible matrix P is a product of elementary matrices (Problem 3.36), one way of obtaining the diagonal form P*AP is by a sequence of elementary row operations and the same sequence of elementary column operations. These same elementary row opera- tions on / will yield PK This method is illustrated in the next example. Example 12.4: Let A - matrix (A, I) 1 2-3 2 5 — 4 |, a symmetric matrix. It is convenient to form the block -3 -4 8 (A.I) 1 2 -3 1 2 5 -4 1 3 -4 8 1 We apply the operations R^ -» — 2/Ji + R^ and R^ -^ SR^ + R3 to {A, I), and then the corresponding operations C^ -^ — 2Ci + C^ and Cg -* 3Ci + C3 to A to obtain 1 2 -3 1 1 2 -2 1 2 -1 3 1 and then We next apply the operation ^3 -> —2R2 + R3 and then the corresponding operation C3 -♦ — 2C2 + C3 to obtain /l 1 1 2 -2 1 -5 7 -2 and then 1 1 1 0! -2 1 -5 1 7 -2 Now A has been diagonalized. We set P - and then Pt A P Definition: A mapping q:V-*K is called a quadratic form if q{v) = f{v,v) for some symmetric bilinear form / on V. We call q the quadratic form associated with the symmetric bilinear form /. If 1 + 1 7^ in if, then / is obtainable from q according to the identity f(u,v) = Uq{u + v) - q{u) - q{v)) The above formula is called the polar form of /. Now if / is represented by a symmetric matrix A = (an), then q is represented in the form (an ai2 ... ain\ /xi\ a21 ^22 ... azn \l X2 ... ttnl a„2 ... annj \Xnj = y ttiiXiXj - aiiccf + 022X2 + • • • + annxl +22 atiXiXj u *<^ The above formal expression in variables Xi is termed the quadratic polynomial correspond- ing to the symmetric matrix A. Observe that if the matrix A is diagonal, then q has the diagonal representation q{X) - X*AX - anxl + a22xl + ••• + annxl that is, the quadratic polynomial representing q will contain no "cross product" terms. By Theorem 12.4, every quadratic form has such a representation (when 1 + 1^0). CHAP. 
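Example 12.4's simultaneous row-and-column reduction can be carried out mechanically. The sketch below (assuming numpy; the helper row_col is ours, not the book's) applies the same three operation pairs to the block matrix (A, I): the row operation acts on the whole block, the matching column operation only on the A-part, so the right block accumulates P^t while the left block becomes P^t A P = diag(1, 1, -5):

```python
import numpy as np

A = np.array([[ 1.,  2., -3.],
              [ 2.,  5., -4.],
              [-3., -4.,  8.]])

M = np.hstack([A, np.eye(3)])       # the block matrix (A, I)

def row_col(M, i, j, k):
    """R_j -> k*R_i + R_j on the whole block, then C_j -> k*C_i + C_j
    on the A-part (columns 0..2 of M)."""
    M[j] += k * M[i]
    M[:, j] += k * M[:, i]

row_col(M, 0, 1, -2.0)   # R2 -> -2R1 + R2  and  C2 -> -2C1 + C2
row_col(M, 0, 2,  3.0)   # R3 ->  3R1 + R3  and  C3 ->  3C1 + C3
row_col(M, 1, 2, -2.0)   # R3 -> -2R2 + R3  and  C3 -> -2C2 + C3

D, Pt = M[:, :3], M[:, 3:]
P = Pt.T
assert np.allclose(D, np.diag([1., 1., -5.]))
assert np.allclose(P.T @ A @ P, D)
```

Since row operations multiply on the left and column operations on the right, the two kinds of operations commute and may be interleaved freely, as in the text.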
Example 12.5: Consider the following quadratic form on R^2:
    q(x, y) = 2x^2 - 12xy + 5y^2
One way of diagonalizing q is by the method known as "completing the square", which is fully described in Problem 12.35. In this case, we make the substitution x = s + 3t, y = t to obtain the diagonal form
    q(x, y) = 2(s + 3t)^2 - 12(s + 3t)t + 5t^2 = 2s^2 - 13t^2

REAL SYMMETRIC BILINEAR FORMS. LAW OF INERTIA

In this section we treat symmetric bilinear forms and quadratic forms on vector spaces over the real field R. These forms appear in many branches of mathematics and physics. The special nature of R permits an independent theory. The main result follows.

Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there is a basis of V in which f is represented by a diagonal matrix; every other diagonal representation of f has the same number P of positive entries and the same number N of negative entries. The difference S = P - N is called the signature of f.

A real symmetric bilinear form f is said to be nonnegative semidefinite if q(v) = f(v, v) >= 0 for every vector v, and is said to be positive definite if q(v) = f(v, v) > 0 for every vector v ≠ 0. By the above theorem,
    (i) f is nonnegative semidefinite if and only if S = rank(f)
    (ii) f is positive definite if and only if S = dim V
where S is the signature of f.

Example 12.6: Let f be the dot product on R^n; that is, f(u, v) = u · v = a_1 b_1 + a_2 b_2 + ... + a_n b_n, where u = (a_i) and v = (b_i). Note that f is symmetric since
    f(u, v) = u · v = v · u = f(v, u)
Furthermore, f is positive definite because
    f(u, u) = a_1^2 + a_2^2 + ... + a_n^2 > 0   when u ≠ 0

In the next chapter we will see how a real quadratic form q transforms when the transition matrix P is "orthogonal". If no condition is placed on P, then q can be represented in diagonal form with only 1's and -1's as nonzero coefficients. Specifically,

Corollary 12.6: Any real quadratic form q has a unique representation in the form
    q(x_1, ..., x_n) = x_1^2 + ... + x_s^2 - x_{s+1}^2 - ... - x_{s+t}^2

The above result for real quadratic forms is sometimes referred to as the Law of Inertia or Sylvester's Theorem.

266 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12

HERMITIAN FORMS

Let V be a vector space of finite dimension over the complex field C. Let f : V x V -> C be such that
    (i) f(au_1 + bu_2, v) = a f(u_1, v) + b f(u_2, v)
    (ii) f(u, v) = \overline{f(v, u)}
where a, b ∈ C and u_i, v ∈ V. Then f is called a Hermitian form on V. (As usual, \bar{k} denotes the complex conjugate of k ∈ C.) By (i) and (ii),
    f(u, av_1 + bv_2) = \overline{f(av_1 + bv_2, u)} = \overline{a f(v_1, u) + b f(v_2, u)}
                      = \bar{a} \overline{f(v_1, u)} + \bar{b} \overline{f(v_2, u)} = \bar{a} f(u, v_1) + \bar{b} f(u, v_2)
That is,
    (iii) f(u, av_1 + bv_2) = \bar{a} f(u, v_1) + \bar{b} f(u, v_2)
As before, we express condition (i) by saying f is linear in the first variable, and condition (iii) by saying f is conjugate linear in the second variable. Note that, by (ii), f(v, v) = \overline{f(v, v)} and so f(v, v) is real for every v ∈ V.

Example 12.7: Let A = (a_ij) be an n x n matrix over C. We write \bar{A} for the matrix obtained by taking the complex conjugate of every entry of A, that is, \bar{A} = (\bar{a}_ij). We also write A* for \bar{A}^t = \overline{A^t}. The matrix A is said to be Hermitian if A* = A, i.e. if a_ij = \bar{a}_ji. If A is Hermitian, then f(X, Y) = X^t A \bar{Y} defines a Hermitian form on C^n (Problem 12.16).

The mapping q : V -> R defined by q(v) = f(v, v) is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form f. We can obtain f from q according to the following identity, called the polar form of f:
    f(u, v) = (1/4)(q(u + v) - q(u - v)) + (i/4)(q(u + iv) - q(u - iv))

Now suppose {e_1, ..., e_n} is a basis of V. The matrix H = (h_ij), where h_ij = f(e_i, e_j), is called the matrix representation of f in the basis {e_i}. By (ii), f(e_i, e_j) = \overline{f(e_j, e_i)}; hence H is Hermitian and, in particular, the diagonal entries of H are real. Thus any diagonal representation of f contains only real entries.
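A quick numerical illustration of these facts, assuming numpy and the book's convention f(X, Y) = X^t H Y-bar; H is the Hermitian matrix that appears later in Problem 12.18:

```python
import numpy as np

H = np.array([[1,      1 + 1j, 2j    ],
              [1 - 1j, 4,      2 - 3j],
              [-2j,    2 + 3j, 7     ]])
assert np.allclose(H, H.conj().T)        # H* = H, so H is Hermitian

f = lambda u, v: u @ H @ v.conj()        # f(X, Y) = X^t H Y-bar

rng = np.random.default_rng(0)
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# Condition (ii): f(u, v) is the conjugate of f(v, u) ...
assert np.isclose(f(u, v), np.conj(f(v, u)))
# ... hence the Hermitian quadratic form q(v) = f(v, v) is real:
assert abs(f(v, v).imag) < 1e-9
```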
The next theorem is the complex analog of Theorem 12.5 on real symmetric bilinear forms. Theorem 12.7: Let / be a Hermitian form on V. Then there exists a basis {ei, . . ., en} of V in which / is represented by a diagonal matrix, i.e. f{ei, ei) = for i ¥= j. Moreover, every diagonal representation of / has the same number P of positive entries, and the same number N of negative entries. The difference S = P-N is called the signature of /. Analogously, a Hermitian form / is said to be nonnegative semidefinite if q{v) = f{v,v) ^ for every v eV, and is said to be positive definite if q{v) = f{v,v) > for every v ¥=0. Example 12.8: Let / be the dot product on C"; that is, f(u, V) — U'V = ZiWi + Z2W2 + ■ ■ • + z„w„ where u = («{) and v = (Wj). Then / is a Hermitian form on C". Moreover, / is positive definite since, for any v # 0, f(u, u) = zi«i + Z2,Z2+ •■■ + z„z„ = l^iP + I22P + • • • + K\^ > CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 267 Solved Problems BILINEAR FORMS 12.1. Let u = {xi, X2, xs) and v = (t/i, 1/2, ya), and let f{u, v) = Sxiyi — 2xiyi + 5x22/1 + 7*21/2 - 8a;2i/3 + 4x33/2 - Xzys Express / in matrix notation. Let A be the 3X3 matrix whose i;-entry is the coefficient of XiVy Then /3 -2 0\/j/i^ f{u,v) = XtAY = (^i.xa.ajg) 5 7-8 : \0 4 -l/\2/3i 12.2. Let A be an « X « matrix over K. Show that the following mapping / is a bilinear form on K": f{X,Y) = X*AY. For any a,bGK and any Xj, Fj e K", /(aATi + 6Z2, F) = (aZi + 6X2)*^ Y = (aXj + 6X^) .4 F = aXlAY + bXlAY = a/(Zi, F) + 6/(X2, F) Hence / is linear in the first variable. Also, /(X, aFi + ftFa) = XtA(aFi + 6F2) = aX^AYi + bX^AY^ = a /(X, Fj) + 6 /(X, F2) Hence / is linear in the second variable, and so / is a bilinear form on K". 12.3. Let / be the bilinear form on R^ defined by f{{xu X2), (yi, 2/2)) = 2xi2/i - 3xi2/2 + X22/2 (i) Find the matrix A of / in the basis {Ui — (1,0), 112 = (1, 1)}. (ii) Find the matrix B of / in the basis {vi = (2, 1), V2 = (1, -1)}. 
(iii) Find the transition matrix P from the basis {mi} to the basis {Vi}, and verify that B^P*AP. (i) Set A = (ay) where ay = /(ttj, Uj): an = f(ui,ui) = /((1,0), (1,0)) =2-0 + 0= 2 ai2 = /(mi,M2) = /((1,0), (1,1)) = 2-3 + = -1 021 = /(M2.M1) = /((1,1), (1,0)) = 2-0 + 0= 2 022 = /(M2.W2) = /((l.l), (1,1)) = 2-3 + 1 = =(r;)' Thus A = ( ft / *^ *^® matrix of / in the basis {u^, n^}. (ii) Set B = (6y) where 6y = /(Vj,-!;,): 611 = /(^i.i'i) = /((2,1), (2,1)) = 8-6 + 1 = 3 612 = /(^i.'y2) = /((2.1), (1,-1)) = 4 + 6-1 = 9 621 = f(->'2,'>'i) = /((l.-l), (2,1)) = 4-3-1 = 622 = /(^2.^2) = /((I, -1), (1,-1)) = 2 + 3 + 1 = 6 /3 9\ Thus B = I ft / 's ^^^ matrix of / in the basis {vi, v^}. (iii) We must write Vi and V2 in terms of the Mj: ri = (2,1) = (1,0) + (1,1) = M1 + M2 V2 = (1,-1) = 2(1,0) -(1,1) = 2Mi-M2 268 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 Then ^ = ( J _j) and so P' = Q _M . Thus 12.4. Prove Theorem 12.1: Let F be a vector space of dimension n over K. Let {^j, . . . , ^„} be a basis of the dual space V*. Then {/«: ij" = 1,. . .,%} is a basis of B{V) where /« is defined by f^.{u,v) = ^.(m)^.(v). Thus, in particular, dimB(F) = n\ Let {e^, . . ., e„} be the basis of V dual to {^J. We first show that {/„} spans B(y). Let / G B(y) and suppose /(ej, e^) = ay. We claim that / = ^ay/y. It suffices to show that /(e^.e,) = (2aij/ij)(es, ej) for s,t = l,...,n. We have (2 ay/y)(es, et) = 2ay/«(es, ej) = 2 Oy ^i(es) ^^(et) = 2ay«isSjt = "st = f(es,et) as required. Hence {/y} spans B{V). It remains to show that {/y} is linearly independent. Suppose Soy/y = 0. Then for s,t = 1,. . .,n, = 0(es, et) = (2 ao/y)(es, Bf) = a^s The last step follows as above. Thus {/y} is independent and hence is a basis of B(V). 12.5. Let [/] denote the matrix representation of a bilinear form / on F relative to a basis {ei, . . .,e„) of V. Show that the mapping / i-» [/] is an isomorphism of B{V) onto the vector space of w-square matrices. 
Since / is completely determined by the scalars f(e^, ej), the mapping /*"*[/] is one-to-one and onto. It suffices to show that the mapping / l-> [/] is a homomorphism; that is, that [af+bg] = a[f] + b[g] (*) However, for i,j = 1,. . .,n, (af + bg){ei, e^) = «/(«{, e^) + bg(ei, Bj) which is a restatement of (*). Thus the result is proved. 12.6. Prove Theorem 12.2: Let P be the transition matrix from one basis {e,} to another basis {gj}. If A is the matrix of / in the original basis {Ci}, then B = P'^AP is the matrix of / in the new basis {e\}. Let u,v eV. Since P is the transition matrix from {e^} to {e,-}, we have P[u]g, — [u]g and P[v]e- = [v]e- hence [u]l = [u]l, PK Thus f{u,v) = [u]lA[v], = [u]l.PtAP[v],. Since u and v are arbitrary elements of V, P^ A P is the matrix of / in the basis {e^}. SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 12.7. Find the symmetric matrix which corresponds to each of the following quadratic polynomials: (i) q{x, y) = 4x^ - 6xy - 7y^ (iii) q{x, y, z) = 3x^ + 4xy - y' + 8xz — Qyz + z^ (ii) q{x, y) - xy + y^ (iv) q(x, y, z) = x^ - 2yz + xz The symmetric matrix A — (uy) representing q(xi, . . .,x„) has the diagonal entry a^ equal to the coefficient of xf and has the entries ay and ajj each equal to half the coefficient of ajjajj-. Thus CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 269 (ii) 12.8. For each of the following real symmetric matrices A, find a nonsingular matrix P such that P* AP is diagonal and also find its signature: (i) A - -: (ii) A = (i) First form the block matrix (A, I): {A, I) = - Apply the row operations R2^^Ri + R2 and iZa -» — 2i2i + iJj to (A,/) and then the corre- sponding column operations C^ -> SCj + C^, and C3 -» — 2Ci + C3 to A to obtain 1 -3 2 1 0\ /I 1 -2 1 3 1 and then -2 1 3 1 1 4 -2 1/ \o 1 4 -2 1 Next apply the row operation B3 -» i?2 + ^^3 ^^^ then the corresponding column operation C3-* C2 + 2C3 to obtain and then then P^AP - Now A has been diagonalized. 
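The rule stated in Problem 12.7 — the diagonal entry a_ii takes the coefficient of x_i^2, while a_ij = a_ji each take half the coefficient of x_i x_j — can be sketched as follows (assuming numpy; sym_matrix is our own helper, illustrated on the polynomial of 12.7(i)):

```python
import numpy as np

def sym_matrix(coeffs, n):
    """Symmetric matrix of a quadratic polynomial in n variables.
    coeffs maps (i, j) with i <= j to the coefficient of x_i * x_j;
    a_ii is that coefficient, and a_ij = a_ji is half of it for i < j."""
    A = np.zeros((n, n))
    for (i, j), c in coeffs.items():
        if i == j:
            A[i, i] = c
        else:
            A[i, j] = A[j, i] = c / 2
    return A

# q(x, y) = 4x^2 - 6xy - 7y^2, as in Problem 12.7(i)
A = sym_matrix({(0, 0): 4, (0, 1): -6, (1, 1): -7}, 2)
assert np.allclose(A, [[4, -3], [-3, -7]])
```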
Set P The signature S of A is 5 = 2-1 = 1 (ii) First form the block matrix (A, I): (A,/) = In order to bring the nonzero diagonal entry —1 into the first diagonal position, apply the row operation Ri <-> R^ and then the corresponding column operation Ci «-> C3 to obtain 1 1 1 1 -2 2 1 1 2 -1 1 1 2 -1 1\ -1 2 1 1 1 -2 2 1 and then 2 -2 1 1 1 1 1 0/ \ 1 1 1 Apply the row operations i?2 ~* 2Bi + -^2 ^nd JB3 -> iJj + R^ and then the corresponding column operations C^ -* 2Ci + C^ and C3 ^ Ci + C3 to obtain 1 2 1 1\ /-I 1 2 3 1 2 and then 2 3 1 2 3 1 1 1/ \ 3 1 1 1 270 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 Apply the row operation iJg -> —3^2 + 2R3 and then the corresponding column operation C3 -» -3C2 + 2C3 to obtain /-I l\ /-I 1^ 2 3 12 and then 2 12 \ -7 2-3-4/ \ -14 2 -3 -4, /O 2\ Now A has been diagonalized. Set P = 1 -3 ; then P'AP = \l 2 -4/ The signature S of 4 is the difference S = 1 — 2 = —1. 12.9. ' Suppose 1 + 1 v^ in K. Give a formal algorithm to diagonalize (under congruence) a symmetric matrix A = (an) over K. Case I: aii¥=0. Apply the row operations fij -> — ajxi?x + OxxiJj, i = 2, . . .,n, and then the /ill 0" corresponding column operations Cj -* — Oji Cj + an C; to reduce A to the form ( ^0 ^ Case II: a^ = but a^ ^ 0, for some i > 1. Apply the row operation Ri «^i2j and then the corresponding column operation Cj <-> Q to bring ctjj into the first diagonal position. This reduces the matrix to Case I. Case III: All diagonal entries Oji = 0. Choose i, j such that ay ¥= 0, and apply the row opera- tion i?i -> Rj + Rf and the corresponding column operation Ci-* Cj + Cj to bring 2ay # into the ith diagonal position. This reduces the matrix to Case II. /ail 0\ In each of the cases, we can finally reduce A to the form ( . _ ) where B is a symmetric matrix of order less than A. By induction we can finally bring A into diagonal form. Remark: The hypothesis that 1 + 1 #^ in K, is used in Case III where we state that 2ay ¥= 0. 12.10. 
Let q be the quadratic form associated with the symmetric bilinear form /. Verify the following polar form of /: fiu,v) - ^q{u + v) - q{u) - q{v)). (Assume that 1 + 1^0.) q(u + v) — q{u) — qiv) = f(u + v,u + v) — f{u, u) — f(v, v) = f(u, u) + f{u, v) + f{v, u) + fiy, v) - f{u, u) - f(v, v) = 2f{u,v) If 1 + 1 7^ 0, we can divide by 2 to obtain the required identity. 12.11. Prove Theorem 12.4: Let / be a symmetric bilinear form on V over K (in which 1 + 1^0). Then V has a basis {vi, . . .,Vn) in which / is represented by a diagonal matrix, i.e. f{Vi, v,) = for i ¥- j. Method 1. If / = or if dim V — 1, then the theorem clearly holds. Hence we can suppose f ¥= Q and dim V = n> 1. If q(v) = f(v, v) - for every v&V, then the polar form of / (see Problem 12.10) implies that / = 0. Hence we can assume there is a vector t?i e V such that f(Vi, v^ ¥= 0. Let U be the subspace spanned by Vi and let W consist of those vectors 1; G y for which /(^i, v) = 0. We claim that V = U ®W. (i) Proof that UnW = {0}: Suppose uG U nW. Since ue^U, u — kv^ for some scalar k&K. Since uGW, = f{u,u) = f(kvi,kvi) = k^ f(Vi,Vi). But /(^i.-Wi) ?^ 0; hence k-0 and there- fore u = kvi = 0. Thus UnW = {0}. CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 271 (il) Proof that V = U + W: Let vG V. Set Then f{v^,w) = f(v„v) - ;^^/(^i'^i) = " Thus w G W. By (1), v is the sum of an element of U and an element of W. Thus V = U + W. By (i) and (ii), V = U ® W. Now / restricted to W is a symmetric bilinear form on W. But dim W — n. — 1; hence by induction there is a basis {^2. • • • . v„} of W such that f(v^, Vj) = for i ¥= j and 2 — i, j — n. But by the very definition of W, fiv^, Vj) = for j = 2, . . .,n. Therefore the basis {v-^, . . . , v„} of V has the required property that /(-Uj, Vj) = for i ¥= j. Method 2. The algorithm in Problem 12.9 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is equivalent to the statement that / has a diagonal matrix representation. 12.12. 
Let A = I ^ 1 , a diagonal matrix over K. Show that: (i) for any nonzero scalars fci, . . . , fc„ e If , A is congruent to a diagonal matrix with diagonal entries aifcf ; (ii) if K is the complex field C, then A is congruent to a diagonal matrix with only I's and O's as diagonal entries; (iii) if K is the real field K, then A is congruent to a diagonal matrix with only I's, —I's and O's as diagonal entries. (i) Let P be the diagonal matrix with diagonal entries fc^. Then ptAP = I ^"2 11 02 02^2 O-nKl fl/Voi if «{ '^ (ii) Let P be the diagonal matrix with diagonal entries f>i — "] -. •« _ r, • Then P^AP has the required form. fl/A/hl if Oi^O (iii) Let P be the diagonal matrix with diagonal entries 6j = ■< _ . Then P^AP has the required form. ^ ' Remark. We emphasize that (ii) is no longer true if congruence is replaced by Hermitian con- gruence (see Problems 12.40 and 12.41). 12.13. Prove Theorem 12.5: Let / be a symmetric bilinear form on V over R. Then there is a basis of V in which / is represented by a diagonal matrix, and every other diagonal representation of / has the same number of positive entries and the same number of negative entries. By Theorem 12.4, there is a basis {itj, . . . , u^} of V in which / is represented by a diagonal matrix, say, with P positive and N negative entries. Now suppose {wi, . . ., w„} is another basis of V in which / is represented by a diagonal matrix, say, with P' positive and N' negative entries. We can assume without loss in generality that the positive entries in each matrix appear first. Since rank (f) - P + N = P' + N', it suffices to prove that P = P'. 272 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 Let [7 be the linear span of u^, . .., up and let W be the linear span of Wp, + 1, . . . , m)„. Then f{v,v) > for every nonzero v e U, and f{v,v) ^ for every nonzero v & W. Hence UnW = {0}. Note that dimU = P and dimW = n- P'. 
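By Theorem 12.5, the signature of a real symmetric matrix is a congruence invariant. One convenient way to compute it numerically — an approach not used in the book, but justified by Sylvester's law of inertia, since A is congruent (indeed orthogonally similar) to the diagonal matrix of its eigenvalues — is via numpy's symmetric eigenvalue routine:

```python
import numpy as np

def signature(A, tol=1e-9):
    """Signature S = P - N of a real symmetric matrix A, computed from
    the signs of its eigenvalues (valid by Sylvester's law of inertia)."""
    w = np.linalg.eigvalsh(A)
    return int((w > tol).sum()) - int((w < -tol).sum())

A = np.array([[ 1.,  2., -3.],
              [ 2.,  5., -4.],
              [-3., -4.,  8.]])
assert signature(A) == 1                 # A is congruent to diag(1, 1, -5)

# Invariance under congruence: P^t A P has the same signature for any
# invertible P (this particular P is made up).
P = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 3., 1.]])
assert signature(P.T @ A @ P) == signature(A)
```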
Thus dim{U+W) = dimU + dimW - dim(UnW) = P + {n-P')-0 = P - P' + n But dim(U+W) ^ dim V = n; hence P-P' + n^n or P ^ P'. Similarly, P' ^ P and there- fore P = P', as required. Remark. The above theorem and proof depend only on the concept of positivity. Thus the theorem is true for any subfield K of the real field R. 12.14. An nxn real symmetric matrix A is said to be positive definite if X*AX > for every nonzero (column) vector X G R", i.e. if A is positive definite viewed as a bilinear form. Let B be any real nonsingular matrix. Show that (i) B*B is sym- metric and (ii) B*B is positive definite. (i) {Btpy = ptP" = B<-B\ hence B'JS is symmetric. (ii) Since B is nonsingular, BX # for any nonzero X S R". Hence the dot product of BX with itself, BX-BX = (BXY(BX), is positive. Thus XHBtB)X = {XtBt){BX) = (BX)HBX) > as required. HERMITIAN FORMS 12.15. Determine which of the following matrices are Hermitian: 2-i 4 + i\ 6 i i 3 / (ii) A matrix A = (Oj^) is Hermitian iff A = A*, i.e. iflf o.^ = 'ajl. (i) The matrix is Hermitian, since it is equal to its conjugate transpose, (ii) The matrix is not Hermitian, even though it is symmetric, (iii) The matrix is Hermitian. In fact, a real matrix is Hermitian if and only if it is symmetric. 12.16. Let A be a Hermitian matrix. Show that / is a Hermitian form on C« where / is defined by f(X, Y)^X<^A Y. For all a, 6 e C and all X^, X^, Y e C", /(aXi + 6X2, y) = (aX^ + hX^YAY = (aX\+hXl)AY = aX\AY + bXlAY = af(Xi,Y) + bf(X2,Y) Hence / is linear in the first variable. Also, f{X, Y) = xTaY = (XTaYY = Wa^X = YtA*X = Yt A X = f(Y,X) Hence / is a Hermitian form on C". (Remark. We use the fact that X* AY is a. scalar and so it is equal to its transpose.) 12.17. Let / be a Hermitian form on V. Let H be the matrix of / in a basis {d, . . . , e„} of V. 
Show that: (i) f{u,v) = [u]lH\v]e tor al\ u,vGV; (ii) if P is the transition matrix from {ei} to a new basis {e,'} of V, then B = P*HP (or: B-Q*HQ where Q = P) is the matrix of / in the new basis {e,'}. Note that (ii) is the complex analog of Theorem 12.2. (i) Let u,v GV and suppose u = ajCi + 0362 + • • • + a„e„ and v — b^ei + 62^2 + • • • + &««« CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 273 Then f{u, v) = fia^Ci + • • • + a„e„, 6161 + • • ■ + 6„e„) as required. \ " (ii) Since P is the transition matrix from {ej to {e^}, then P[u],. = [u]„ P[v],, = [v], and so [m]^ = [m]*, P', [^= P ['^ Thus by (i), f(u,v) = {u]\H [v]^ — [u\l,P^HP [v]^,. But u and v are arbitrary elements of V; hence P* H P is the matrix of / in the basis {el}. 1 1 + i 2i \ 12.18. Let H = \ 1 - i 4 2 — 3i,a Hermitian matrix. Find a nonsingular ma- -2i 2 + Si 1 j trix P such that P*HP is diagonal. First form the block matrix {H, I): 1 1 + i 2i 1 - j 4 2 - 3i -2i 2 + 3i 7 Apply the row operations R2 -* (—1 + 'i)Ri + R2 ^^^ ^3 ^ 2i/2i + i?3 to (A, /) and then the corresponding "Hermitian column operations" (see Problem 12.42) C2 -> (—1 — t)Ci + C2 and C3 -» — 2iCi + Cg to A to obtain and then Next apply the row operation R^ -> — SiBg + Zi^a and the corresponding Hermitian column opera- tion C3 -* hiC^ + 2C3 to obtain and then Now H has been diagonalized. Set '1 -1 + i 5 + 9i^ P = I 1 -hi ,0 2 and then P^HP Observe that the signature S of H is jS = 2 — 1 = 1. MISCELLANEOUS PROBLEMS 12.19. Show that any bilinear form / on F is the sum of a symmetric bilinear form and a skew symmetric bilinear form. Set g(v,,v) = ^[f{u,v) + f(v,u)\ and h{u,v) = ^[f(u,v) — f(v,u)\. Then g is symmetric because g{u,v) = ^[f(u,v) + f(v,u)] = ^[f{v,u) + f{u,v)] = g(v,u) and /i is skew symmetric because h{u,v) = ^[f{u,v) — f{v,u)] = -^[/(v.m) -/(«,-!))] = -h(v,u) Furthermore, f — g + h. 274 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 12.20. Prove Theorem 12.3: Let / be an alternating bilinear form on V. 
Then there exists a basis of V in which / is represented by a matrix of the form 1 I -1 I 1-10 1 L_oj T, To" 1 Moreover, the number of ( a M^ uniquely determined by / (because it is equal to i[rank (/)]). \'^ "/ If / = 0, then the theorem is obviously true. Also, if dim y = 1, then fik^u, fcaw) = fcifcj f(u, m) = and so / = 0. Accordingly we can assume that dim F > 1 and f ¥- 0. Since / # 0, there exist (nonzero) Wj, Wg G 7 such that /(mj, u^) # 0. In fact, multiplying Ml by an appropriate factor, we can assume that f{ui, u^ = 1 and so /(m2, "i) = —1- Now Ui and M2 are linearly independent; because if, say, u^ = fewj, then /(mj, u^) = /(mi, ku^) = k f(ui, u^) = 0. Let C7 be the subspace spanned by Ml and M2, i.e. U = L{ux,u^. Note: / n 1 (i) the matrix representation of the restriction of / to ?7 in the basis {mi, Mg} is ( _ ^ ^ (ii) if uG U, say u = aui + 6M2, then /(m. Ml) = /(aMi + 6m2, Ml) = —6 f(u, M2) = f(aui + bu2, M2) = "' Let TF consist of those vectors wGV such that /(w,Mi) = and /(w,M2) = 0. Equivalently, W = {wGV: f(w, m) = for every uGU} We claim that V=U®W. It is clear that UnW = {0}, and so it remains to show that V = U+W. Let veV. Set M = f{v, u^uy - /(■«, Mi)m2 and w = v -u U) Since M is a linear combination of Mi and Mj, uGU. We show that weW. By (i) and (ii), f(u,U]) = f(v,Ui); hence /(W, Ml) = f(v - U, Ml) = fiv, Ml) - /{m. Ml) = Similarly, /(m, M2) = f(v, Mg) and so f{w, M2) = /(f - M, M2) = /(v, U2) - /(m, M2) = Then w G TF and so, by (i), v = m + w where m G f/ and w&W. This shows that V^U+W\ and therefore y = ?7 © TF. Now the restriction ot f to W is an alternating bilinear form on W. By induction, there exists a basis M3, . . .,m„ of IF in which the matrix representing / restricted to W has the desired form. Thus Mi,M2,M3,. . .,M„ is a basis of V in which the matrix representing / has the desired form. (ii) Find the matrix of / in the basis -'' ^ °^ (^ ^^ 1^ °^ 1^ ° CHAP. 
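Theorem 12.3's canonical block form, and the skew symmetry and even rank it implies, can be checked numerically (a sketch assuming numpy; the 4 x 4 example matrix is made up, with one 2 x 2 block and zeros elsewhere):

```python
import numpy as np

# An alternating bilinear form on R^4 in the canonical basis of Theorem 12.3.
J = np.array([[0., 1.], [-1., 0.]])
Z = np.zeros((2, 2))
A = np.block([[J, Z],
              [Z, Z]])
f = lambda u, v: u @ A @ v

rng = np.random.default_rng(1)
u, v = rng.standard_normal((2, 4))

assert np.isclose(f(v, v), 0.0)          # alternating: f(v, v) = 0
assert np.isclose(f(u, v), -f(v, u))     # hence skew symmetric
assert np.linalg.matrix_rank(A) == 2     # rank = 2 * (number of blocks), even
```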
12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 275 Supplementary Problems BILINEAR FORMS 12.21. Let M = (xi, x^ and v = (i/x, y^. Determine which of the following are bilinear forms on R^: (i) /(m, V) = 2a;i2/2 - SajjJ/i (iv) f(u, v) = X1X2 + j/iJ/2 (ii) f(u, v) = xi + 3/2 (v) f{u, v) = 1 (iii) /(m,v) = 3a;2V2 (vi) /(w, v) = 0. 12.22. Let / be the bilinear form on R2 defined by /((«!, X2), (vi, 2/2)) = SxiVi - 2x12/2 + 4x2yi - a;2?/2 (i) Find the matrix A of / in the basis {u^ = (1, 1), Mj = (1, 2)}. (ii) Find the matrix B of / in the basis {v^ = (1, —1), V2 = (3, 1)}. (iii) Find the transition matrix P from {mJ to {vj and verify that B = P* A P. 12.23. Let V be the vector space of 2 X 2 matrices over B. Let Af = ( ^ j, and let f{A,B) = tr (At MB) where A,B&V and "tr" denotes trace, (i) Show that / is a bilinear form on V. '1 0\ /O 1\ /o 0\ /O 0^ o)'\o oj' [1 o)'[q 1, 12.24. Let B{V) be the set of bilinear forms on V over K. Prove: (i) if f,gGB(V), then f + g and kf, for k G K, also belong to BiV), and so B(V) is a subspace of the vector space of functions from V X V into K; (ii) if and a are linear functionals on V, then f{u, v) = ,p{u) a(v) belongs to B{V). 12.25. Let / be a bilinear form on V. For any subset S of V, we write S-*- = {ve^V: f{u,v) = for every mSS}, S""" = {vGV : f{v,u) = foreverywGS} Show that: (i) S"*" and S^ are subspaces of V; (ii) SjCSg implies S2 c st and sj C s} ; (iii) {0}-L ={0}T =y. 12.26. Prove: If / is a bilinear form on V, then rank (/) = dim y - dimy-*" = dim y - dimy^ and hence dim y-*" = dim y^. 12.27. Let / be a bilinear form on V. For each uGV, let u .V -^ K and u iV -* K be defined by u(x) = f(x,u) and u{x)=f(u,x). Prove: As. >^ ^ ^ (1) u and u are each linear, i.e. u,u G V*; (ii) u h* u and m M it are each linear mappings from V into V*; (iii) rank (/) = rank (m K m) = rank {u K tt). 12.28. Show that congruence of matrices is an equivalence relation, i.e. 
(i) A is congruent to A; (ii) if A is congruent to B, then B is congruent to A; (iii) if A is congruent to B and B is congruent to C, then A is congruent to C. SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 12.29. Find the symmetric matrix belonging to each of the following quadratic polynomials: (i) q(x, y, z) = 2x^ - Sxy + y^ - IGxz + liyz + 5«2 (ii) q(x, y, z) = x^ — xz + y^ (iii) q(x, y, z) — xy + y^ + 4xz + z^ (iv) q(x, y, z) = xy + yz. 276 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 12.30. For each of the following matrices A, find a nonsingular matrix P such that P'^AP is diagonal: 1 1-2-3^ 2 3\ / I ! ^ \ (12-5-1 -2-5 6 9 -3 -1 9 11 (i) ^ = (3 4)' («) A = -2 6 -9 . (iii) A = , 2 _g g In each case find the rank and signature. 12.31. Let q be the quadratic form associated with a symmetric bilinear form /. Verify the following alternate polar form of /: f(u,v) = ^[q(u + v) — q(u — v)]. 12.32. Let S(V) be the set of symmetric bilinear forms on V. Show that: (i) S{V) is a subspace of B(V); (ii) if dim V = n, then dim S(y) = ^n(n + 1). 12.33. Let / be the symmetric bilinear form associated with the real quadratic form qix,y) = ax^ + bxy + cy^. Show that: (i) / is nondegenerate if and only if b^ — 4ac ¥= 0; (ii) / is positive definite if and only if a > and b^ — 4ac < 0. 12.34. Suppose A is a real symmetric positive definite matrix. Show that there exists a nonsingular matrix P such that A = P«P. n 12.35. Consider a real quadratic polynomial qix^, . . . , a;„) = 2 Oij^i^j, where ay = ttjj. ijj = 1 (i) If an '^ 0, show that the substitution Xi = Vi (ai22/2 + • • ■ + «-ln2/n). «2 = 2/2. ••■' «« = 2/n aji yields the equation q{xi x^) = a^yl + q'iVz, ■ ■ -yVn), where q' is also a quadratic polynomial. (ii) If an = but, say, ajg ^ 0, show that the substitution xi-yi + 2/2. X2 = yi- V2> «3 = 2/3. • • • . ^n = Vn yields the equation q{xi,...,x„) = 2 Mi^/j- "^^^^^ ^n ^^ 0. i.e. reduces this case to case (i). 
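The polar identities — f(u, v) = (1/2)[q(u + v) - q(u) - q(v)] from Problem 12.10 and the alternate form f(u, v) = (1/4)[q(u + v) - q(u - v)] of Problem 12.31 — are easy to check numerically (a sketch assuming numpy; the symmetric matrix below is made up):

```python
import numpy as np

A = np.array([[ 2., -3.,  0.],
              [-3.,  1.,  4.],
              [ 0.,  4., -1.]])          # symmetric, so f is symmetric
f = lambda u, v: u @ A @ v
q = lambda v: f(v, v)                    # the associated quadratic form

rng = np.random.default_rng(2)
u, v = rng.standard_normal((2, 3))

assert np.isclose(f(u, v), 0.5 * (q(u + v) - q(u) - q(v)))     # Problem 12.10
assert np.isclose(f(u, v), 0.25 * (q(u + v) - q(u - v)))       # Problem 12.31
```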
This method of diagonalizing q is known as "completing the square". 12.36. Use steps of the type in the preceding problem to reduce each quadratic polynomial in Problem 12.29 to diagonal form. Find the rank and signature in each case. HERMITIAN FORMS 12.37. For any complex matrices A, B and any k e C, show that: (i) ATB = A+B, (ii) M^JcA, (iii) AB = AB, (iv) A* = A^. 12.38. For each of the following Hermitian matrices H, find a nonsingular matrix P such that P* H P is diagonal: , . ^^., Find the rank and signature in each case. 12.39. Let A be any complex nonsingular matrix. Show that H = A*A is Hermitian and positive definite. 12.40. We say that B is Hermitian congruent to A if there exists a nonsingular matrix Q such that B = Q*AQ. Show that Hermitian congruence is an equivalence relation. CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 277 12.41. Prove Theorem 12.7: Let / be a Hermitian form on V. Then there exists a basis {e^, . . .,e„} of V in which / is represented by a diagonal matrix, i.e. /(ej, ej) = for t # j. Moreover, every diagonal representation of / has the same number P of positive entries and the same number N of negative entries. (Note that the second part of the theorem does not hold for complex symmetric bilinear forms, as seen by Problem 12.12(ii). However, the proof of Theorem 12.5 in Problem 12.13 does carry over to the Hermitian case.) MISCELLANEOUS PROBLEMS 12.42. Consider the following elementary row operations: [ai] Bi <^ Rj, [oj Ri -* kRi, k ¥- 0, [og] jB; ^ kRj + fij The corresponding elementary column operations are, respectively, Ih] Ci^Cj, [bzl Ci-^kCi,ky^O, [bs] d^kCj + Ct If K is the complex field C, then the corresponding Hermitian column operations are, respectively, [ci] Ci'^Ci, [cj Cj^feCj, fc^O, [cs] Ci-*kCj + Ci (i) Show that the elementary matrix corresponding to [6J is the transpose of the elementary matrix corresponding to [a^]. 
(ii) Show that the elementary matrix corresponding to [cj is the conjugate transpose of the ele- mentary matrix corresponding to [aj. 12.43. Let V and W be vector spaces over K. A mapping / : VxW-^ K is called a bilinear form on V and W if: (i) f(avi + bv2, w) = af(vi, w) + bfiv^, w) (ii) f(v, owj + hWi) = af(v, Wj) + bf^v, w^ for every a,bQK, V; G V, w^ £ W. Prove the following: (i) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions from VXW into K. (ii) If {^1, ...,0^} is a basis of V* and {aj, ...,»„} is a basis of W*, then {/y : i=l,...,m, j = l,...,n} is a basis of B(V,W) where /y is defined by fij(v,w) - 4>i(v) aAw). Thus dim B(V, W) = dim F • dim W. {Remark. Observe that if V = W, then we obtain the space B{V) investigated in this chapter.) m times 12.44. Let y be a vector space over K. A mapping f: VXVX---xV-*K is called a multilinear (or: m-linear) form on V if / is linear in each variable, i.e. for i = l, ...,m, f{...,au+bv, ...) = af(...,u, ...) + bf(...,v, ...) where ^ denotes the ith component, and other components are held fixed. An »i-linear form / is said to be alternating if f{vi, . . . , Vf^) = whenever Vj = v^, i^^ k Prove: (i) The set B^(V) of m-linear forms on V is a subspace of the vector space of functions from VXVX ■■■ XV into K. (ii) The set A,„(y) of alternating m-linear forms on V is a subspace of B^{V). Remark 1. If m = 2, then we obtain the space B(F) investigated in this chapter. Remark 2. If. V = K'", then the determinant function is a particular alternating m-linear form on V. 278 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 Answers to Supplementary Problems 12.21. (i) Yes (ii) No (iii) Yes (iv) No (v) No (vi) Yes 12.22. (i) A = 4 1 7 3 (ii) B -4 20 32 (iii) P = 3 5 -2 -2 12.23. (ii) 2 -4 -8' 12.29. (i) 1-417 1-8 7 5, (ii) 1 -i^ 10 -i 0; '0 (iii) i 2' i 1 2 1/ 'o i o' (iv) U 1 .0 4 0, 12.30. (i) P = (ii) P - (iii) P = 1 -3 2 PtAP 2 -2 S = 0. 
'1 2 0\ /^ ° ° 1 3 , PtAP =02 2/ loo -38 y S = 1. /l -1 -1 26 \ 1 3 13 19 \o 7 PtAP = S = 2. 12.38. (i) P = (ii) P = (iii) P = 1 A ptHP = ^1 » 1 1 -2 + 3i 1 1 i -3 + i 1 i 1 1 PtHP = PtHP S = 2. -14 '1 0' 10 lO 0-4, S = 1. chapter 13 Inner Product Spaces INTRODUCTION The definition of a vector space V involves an arbitrary field K. In this chapter we restrict K to be either the real field R or the complex field C. In the first case we call V a real vector space, and in the second case a complex vector space. Recall that the concepts of "length" and "orthogonality" did not appear in the investiga- tion of arbitrary vector spaces (although they did appear in Chapter 1 on the spaces R" and C"). In this chapter we place an additional structure on a vector space V to obtain an inner product space, and in this context these concepts are defined. We emphasize that V shall denote a vector space of finite dimension unless otherwise stated or implied. In fact, many of the theorems in this chapter are not valid for spaces of infinite dimension. This is illustrated by some of the examples and problems. INNER PRODUCT SPACES We begin with a definition. Definition: Let F be a (real or complex) vector space over K. Suppose to each pair of vectors u,v GV there is assigned a scalar {u, v) G K. This mapping is called an inner product in V if it satisfies the following axioms: [/i] {aui, + hu2, V) = a(ui, v) + h{U2, v) [h] {u,v) = (v/u) [la] {u, u) s= 0; and {u, m) = if and only if u = 0. The vector space V with an inner product is called an inner product space. Observe that {u,u) is always real by [72], and so the inequality relation in [h] makes sense. We also use the notation ||m|| = '\/{u, u) This nonnegative real number ||m|| is called the norm or length of u. Also, using [/i] and [/2] we obtain (Problem 13.1) the relation {u, avi + bv2) = d{u, Vi) + b{u, V2) If the base field K is real, the conjugate signs appearing above and in [/2] may be ignored. 
In the language of the preceding chapter, an inner product is a positive definite symmetric bilinear form if the base field is real, and is a positive definite Hermitian form if the base field is complex. A real inner product space is sometimes called a Euclidean space, and a complex inner product space is sometimes called a unitary space.

Example 13.1: Consider the dot product in Rⁿ:
  u · v = a₁b₁ + a₂b₂ + ··· + aₙbₙ
where u = (aᵢ) and v = (bᵢ). This is an inner product on Rⁿ, and Rⁿ with this inner product is usually referred to as Euclidean n-space. Although there are many different ways to define an inner product on Rⁿ (see Problem 13.2), we shall assume this inner product on Rⁿ unless otherwise stated or implied.

Example 13.2: Consider the dot product on Cⁿ:
  u · v = z₁w̄₁ + z₂w̄₂ + ··· + zₙw̄ₙ
where u = (zᵢ) and v = (wᵢ). As in the real case, this is an inner product on Cⁿ and we shall assume this inner product on Cⁿ unless otherwise stated or implied.

Example 13.3: Let V denote the vector space of m×n matrices over R. The following is an inner product in V:
  ⟨A, B⟩ = tr(Bᵗ A)
where tr stands for trace, the sum of the diagonal elements. Analogously, if U denotes the vector space of m×n matrices over C, then the following is an inner product in U:
  ⟨A, B⟩ = tr(B* A)
As usual, B* denotes the conjugate transpose of the matrix B.

Example 13.4: Let V be the vector space of real continuous functions on the interval a ≤ t ≤ b. Then the following is an inner product on V:
  ⟨f, g⟩ = ∫ₐᵇ f(t) g(t) dt
Analogously, if U denotes the vector space of complex continuous functions on the (real) interval a ≤ t ≤ b, then the following is an inner product on U:
  ⟨f, g⟩ = ∫ₐᵇ f(t) g̅(t) dt

Example 13.5: Let V be the vector space of infinite sequences of real numbers (a₁, a₂, ...) satisfying
  Σᵢ aᵢ² = a₁² + a₂² + ··· < ∞
i.e. the sum converges. Addition and scalar multiplication are defined componentwise:
  (a₁, a₂, ...) + (b₁, b₂, ...) = (a₁ + b₁, a₂ + b₂, ...)
  k(a₁, a₂, ...)
= (ka₁, ka₂, ...)
An inner product is defined in V by
  ⟨(a₁, a₂, ...), (b₁, b₂, ...)⟩ = a₁b₁ + a₂b₂ + ···
The above sum converges absolutely for any pair of points in V (Problem 13.44); hence the inner product is well defined. This inner product space is called ℓ²-space (or: Hilbert space).

Remark 1: If ||v|| = 1, i.e. if ⟨v, v⟩ = 1, then v is called a unit vector or is said to be normalized. We note that every nonzero vector u ∈ V can be normalized by setting v = u/||u||.

Remark 2: The nonnegative real number d(u, v) = ||v − u|| is called the distance between u and v; this function does satisfy the axioms of a metric space (see Problem 13.51).

CAUCHY-SCHWARZ INEQUALITY
The following formula, called the Cauchy-Schwarz inequality, is used in many branches of mathematics.

Theorem 13.1 (Cauchy-Schwarz): For any vectors u, v ∈ V,
  |⟨u, v⟩| ≤ ||u|| ||v||

Next we examine this inequality in specific cases.

Example 13.6: Consider any complex numbers a₁, ..., aₙ, b₁, ..., bₙ ∈ C. Then by the Cauchy-Schwarz inequality,
  |a₁b₁ + ··· + aₙbₙ|² ≤ (|a₁|² + ··· + |aₙ|²)(|b₁|² + ··· + |bₙ|²)
that is, (u · v)² ≤ ||u||² ||v||², where u = (aᵢ) and v = (bᵢ).

Example 13.7: Let f and g be any real continuous functions defined on the unit interval 0 ≤ t ≤ 1. Then by the Cauchy-Schwarz inequality,
  ⟨f, g⟩² = ( ∫₀¹ f(t) g(t) dt )² ≤ ∫₀¹ f²(t) dt · ∫₀¹ g²(t) dt = ||f||² ||g||²
Here V is the inner product space of Example 13.4.

ORTHOGONALITY
Let V be an inner product space. The vectors u, v ∈ V are said to be orthogonal if ⟨u, v⟩ = 0. The relation is clearly symmetric; that is, if u is orthogonal to v, then ⟨v, u⟩ = conj⟨u, v⟩ = 0 and so v is orthogonal to u. We note that 0 ∈ V is orthogonal to every v ∈ V, for
  ⟨0, v⟩ = ⟨0v, v⟩ = 0⟨v, v⟩ = 0
Conversely, if u is orthogonal to every v ∈ V, then ⟨u, u⟩ = 0 and hence u = 0 by [I₃]. Now suppose W is any subset of V.
The orthogonal complement of W, denoted by W⊥ (read "W perp"), consists of those vectors in V which are orthogonal to every w ∈ W:
  W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for every w ∈ W}
We show that W⊥ is a subspace of V. Clearly, 0 ∈ W⊥. Now suppose u, v ∈ W⊥. Then for any a, b ∈ K and any w ∈ W,
  ⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩ = a·0 + b·0 = 0
Thus au + bv ∈ W⊥ and therefore W⊥ is a subspace of V.

Theorem 13.2: Let W be a subspace of V. Then V is the direct sum of W and W⊥, i.e. V = W ⊕ W⊥.

Now if W is a subspace of V, then V = W ⊕ W⊥ by the above theorem; hence there is a unique projection E_W : V → V with image W and kernel W⊥. That is, if v ∈ V and v = w + w′, where w ∈ W, w′ ∈ W⊥, then E_W is defined by E_W(v) = w. This mapping E_W is called the orthogonal projection of V onto W.

Example 13.8: Let W be the z axis in R³, i.e. W = {(0, 0, c) : c ∈ R}. Then W⊥ is the xy plane, i.e. W⊥ = {(a, b, 0) : a, b ∈ R}. As noted previously, R³ = W ⊕ W⊥. The orthogonal projection E of R³ onto W is given by E(x, y, z) = (0, 0, z).

Example 13.9: Consider a homogeneous system of linear equations over R:
  a₁₁x₁ + a₁₂x₂ + ··· + a₁ₙxₙ = 0
  a₂₁x₁ + a₂₂x₂ + ··· + a₂ₙxₙ = 0
  ·····································
  aₘ₁x₁ + aₘ₂x₂ + ··· + aₘₙxₙ = 0
or, in matrix notation, AX = 0. Recall that the solution space W may be viewed as the kernel of the linear operator A. We may also view W as the set of all vectors v = (x₁, ..., xₙ) which are orthogonal to each row of A. Thus W is the orthogonal complement of the row space of A. Theorem 13.2 then gives another proof of the fundamental result: dim W = n − rank(A).

Remark: If V is a real inner product space, then the angle θ between nonzero vectors u, v ∈ V is defined by
  cos θ = ⟨u, v⟩ / (||u|| ||v||)
By the Cauchy-Schwarz inequality, −1 ≤ cos θ ≤ 1 and so the angle θ always exists. Observe that u and v are orthogonal if and only if they are "perpendicular", i.e. θ = π/2.

ORTHONORMAL SETS
A set {uᵢ} of vectors in V is said to be orthogonal if its distinct elements are orthogonal, i.e.
if ⟨uᵢ, uⱼ⟩ = 0 for i ≠ j. In particular, the set {uᵢ} is said to be orthonormal if it is orthogonal and if each uᵢ has length 1, that is, if
  ⟨uᵢ, uⱼ⟩ = δᵢⱼ = 0 for i ≠ j, and 1 for i = j
An orthonormal set can always be obtained from an orthogonal set of nonzero vectors by normalizing each vector.

Example 13.10: Consider the usual basis of Euclidean 3-space R³:
  {e₁ = (1, 0, 0), e₂ = (0, 1, 0), e₃ = (0, 0, 1)}
It is clear that
  ⟨e₁, e₁⟩ = ⟨e₂, e₂⟩ = ⟨e₃, e₃⟩ = 1 and ⟨eᵢ, eⱼ⟩ = 0 for i ≠ j
That is, {e₁, e₂, e₃} is an orthonormal basis of R³. More generally, the usual basis of Rⁿ or of Cⁿ is orthonormal for every n.

Example 13.11: Let V be the vector space of real continuous functions on the interval −π ≤ t ≤ π with inner product defined by ⟨f, g⟩ = ∫₋π^π f(t) g(t) dt. The following is a classical example of an orthogonal subset of V:
  {1, cos t, cos 2t, ..., sin t, sin 2t, ...}
The above orthogonal set plays a fundamental role in the theory of Fourier series.

The following properties of an orthonormal set will be used in the next section.

Lemma 13.3: An orthonormal set {u₁, ..., u_r} is linearly independent and, for any v ∈ V, the vector
  w = v − ⟨v, u₁⟩u₁ − ⟨v, u₂⟩u₂ − ··· − ⟨v, u_r⟩u_r
is orthogonal to each of the uᵢ.

GRAM-SCHMIDT ORTHOGONALIZATION PROCESS
Orthonormal bases play an important role in inner product spaces. The next theorem shows that such a basis always exists; its proof uses the celebrated Gram-Schmidt orthogonalization process.

Theorem 13.4: Let {v₁, ..., vₙ} be an arbitrary basis of an inner product space V. Then there exists an orthonormal basis {u₁, ..., uₙ} of V such that the transition matrix from {vᵢ} to {uᵢ} is triangular; that is, for i = 1, ..., n,
  uᵢ = aᵢ₁v₁ + aᵢ₂v₂ + ··· + aᵢᵢvᵢ

Proof. We set u₁ = v₁/||v₁||; then {u₁} is orthonormal. We next set
  w₂ = v₂ − ⟨v₂, u₁⟩u₁ and u₂ = w₂/||w₂||
By Lemma 13.3, w₂ (and hence u₂) is orthogonal to u₁; then {u₁, u₂} is orthonormal.
We next set
  w₃ = v₃ − ⟨v₃, u₁⟩u₁ − ⟨v₃, u₂⟩u₂ and u₃ = w₃/||w₃||
Again, by Lemma 13.3, w₃ (and hence u₃) is orthogonal to u₁ and u₂; then {u₁, u₂, u₃} is orthonormal. In general, after obtaining {u₁, ..., uᵢ} we set
  w_{i+1} = v_{i+1} − ⟨v_{i+1}, u₁⟩u₁ − ··· − ⟨v_{i+1}, uᵢ⟩uᵢ and u_{i+1} = w_{i+1}/||w_{i+1}||
(Note that w_{i+1} ≠ 0 because v_{i+1} ∉ L(v₁, ..., vᵢ) = L(u₁, ..., uᵢ).) As above, {u₁, ..., u_{i+1}} is also orthonormal. By induction we obtain an orthonormal set {u₁, ..., uₙ} which is independent and hence a basis of V. The specific construction guarantees that the transition matrix is indeed triangular.

Example 13.12: Consider the following basis of Euclidean space R³:
  {v₁ = (1, 1, 1), v₂ = (0, 1, 1), v₃ = (0, 0, 1)}
We use the Gram-Schmidt orthogonalization process to transform {vᵢ} into an orthonormal basis {uᵢ}. First we normalize v₁, i.e. we set
  u₁ = v₁/||v₁|| = (1, 1, 1)/√3 = (1/√3, 1/√3, 1/√3)
Next we set
  w₂ = v₂ − ⟨v₂, u₁⟩u₁ = (0, 1, 1) − (2/√3)(1/√3, 1/√3, 1/√3) = (−2/3, 1/3, 1/3)
and then we normalize w₂, i.e. we set
  u₂ = w₂/||w₂|| = (−2/√6, 1/√6, 1/√6)
Finally we set
  w₃ = v₃ − ⟨v₃, u₁⟩u₁ − ⟨v₃, u₂⟩u₂ = (0, 0, 1) − (1/√3)u₁ − (1/√6)u₂ = (0, −1/2, 1/2)
and then we normalize w₃:
  u₃ = w₃/||w₃|| = (0, −1/√2, 1/√2)
The required orthonormal basis of R³ is
  {u₁ = (1/√3, 1/√3, 1/√3), u₂ = (−2/√6, 1/√6, 1/√6), u₃ = (0, −1/√2, 1/√2)}

LINEAR FUNCTIONALS AND ADJOINT OPERATORS
Let V be an inner product space. Each u ∈ V determines a mapping û : V → K defined by
  û(v) = ⟨v, u⟩
(One such famous result is the Riesz representation theorem.) We use the above theorem to prove Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V. Then there exists a unique linear operator T* on F such that {T{u),v) = {u,T*{v)) for every m, v G V. Moreover, if A is the matrix of T relative to an orthonormal basis {Ci] of V, then the conjugate transpose A* of A is the matrix of T* in the basis {ei}. We emphasize that no such simple relationship exists between the matrices representing T and T* if the basis is not orthonormal. Thus we see one useful property of orthonormal bases. Definition: A linear operator T on an inner product space V is said to have an adjoint operator T* on V if {T{u),v) = {u,T*{v)) for every u,vGV. Thus Theorem 13.6 states that every operator T has an adjoint if V has finite dimension. This theorem is not valid if V has infinite dimension (Problem 13.78). Example 13.13: Let T be the linear operator on C^ defined by T(x, y, z) = {2x + iv,y- Mz, x + {l-i)y + Sz) We And a similar formula for the adjoint T* of T. Note (Problem 7.3) that the matrix of T in the usual basis of C^ is /2 i [r] = 1 -5i \1 1-t 3 Recall that the usual basis is orthonormal. Thus by Theorem 13.6, the matrix of T* in this basis is the conjugate transpose of [T]: [T*] = Accordingly, T*(x,y,z) = (2x + z,-ix + y + {l + i)z,5iy + Bz) The following theorem summarizes some of the properties of the adjoint. Theorem 13.7: Let S and T be linear operators on V and let kGK. Then: (i) (S + T)* = S* + T* (iii) (ST)* = T*S* (ii) (kT)* = kT* (iv) (T*)* = T CHAP. 13] INNER PRODUCT SPACES 285 ANALOGY BETWEEN A{V) AND C, SPECIAL OPERATORS Let A(V) denote the algebra of all linear operators on a finite dimensional inner product space V. The adjoint mapping T^^ T* on A{V) is quite analogous to the conjugation map- ping zi-* z on the complex field C. 
To illustrate this analogy we identify in the following table certain classes of operators T ∈ A(V) whose behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex numbers.

  Class of complex numbers     Behavior under conjugation   Class of operators in A(V)                        Behavior under the adjoint map
  Unit circle (|z| = 1)        z̄ = 1/z                      Orthogonal operators (real case);                 T* = T⁻¹
                                                            unitary operators (complex case)
  Real axis                    z̄ = z                        Self-adjoint operators; also called:              T* = T
                                                            symmetric (real case), Hermitian (complex case)
  Imaginary axis               z̄ = −z                       Skew-adjoint operators; also called:              T* = −T
                                                            skew-symmetric (real case),
                                                            skew-Hermitian (complex case)
  Positive half axis (0, ∞)    z = ww̄, w ≠ 0                Positive definite operators                       T = S*S with S nonsingular

The analogy between these classes of operators T and complex numbers z is reflected in the following theorem.

Theorem 13.8: Let λ be an eigenvalue of a linear operator T on V.
  (i) If T* = T⁻¹, then |λ| = 1.
  (ii) If T* = T, then λ is real.
  (iii) If T* = −T, then λ is pure imaginary.
  (iv) If T = S*S with S nonsingular, then λ is real and positive.

We now prove the above theorem. In each case let v be a nonzero eigenvector of T belonging to λ, that is, T(v) = λv with v ≠ 0; hence ⟨v, v⟩ is positive.

Proof of (i): We show that λλ̄⟨v, v⟩ = ⟨v, v⟩:
  λλ̄⟨v, v⟩ = ⟨λv, λv⟩ = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, I(v)⟩ = ⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λλ̄ = 1 and so |λ| = 1.

Proof of (ii): We show that λ⟨v, v⟩ = λ̄⟨v, v⟩:
  λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = λ̄⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λ = λ̄ and so λ is real.

Proof of (iii): We show that λ⟨v, v⟩ = −λ̄⟨v, v⟩:
  λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, −T(v)⟩ = ⟨v, −λv⟩ = −λ̄⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λ = −λ̄, and so λ is pure imaginary.

Proof of (iv): Note first that S(v) ≠ 0 because S is nonsingular; hence ⟨S(v), S(v)⟩ is positive. We show that λ⟨v, v⟩ = ⟨S(v), S(v)⟩:
  λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨S*S(v), v⟩ = ⟨S(v), S(v)⟩
But ⟨v, v⟩ and ⟨S(v), S(v)⟩ are positive; hence λ is positive.

We remark that all the above operators T commute with their adjoint, that is, TT* = T*T. Such operators are called normal operators.

ORTHOGONAL AND UNITARY OPERATORS
Let U be a linear operator on a finite dimensional inner product space V. As defined above, if
  U* = U⁻¹ or equivalently UU* = U*U = I
then U is said to be orthogonal or unitary according as the underlying field is real or complex. The next theorem gives alternate characterizations of these operators.

Theorem 13.9: The following conditions on an operator U are equivalent:
  (i) U* = U⁻¹, that is, UU* = U*U = I.
  (ii) U preserves inner products, i.e. for every v, w ∈ V, ⟨U(v), U(w)⟩ = ⟨v, w⟩.
  (iii) U preserves lengths, i.e. for every v ∈ V, ||U(v)|| = ||v||.

Example 13.14: Let T : R³ → R³ be the linear operator which rotates each vector about the z axis by a fixed angle θ:
  T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)
Observe that lengths (distances from the origin) are preserved under T. Thus T is an orthogonal operator.

Example 13.15: Let V be the ℓ²-space of Example 13.5. Let T : V → V be the linear operator defined by T(a₁, a₂, ...) = (0, a₁, a₂, ...). Clearly, T preserves inner products and lengths. However, T is not surjective since, for example, (1, 0, 0, ...) does not belong to the image of T; hence T is not invertible. Thus we see that Theorem 13.9 is not valid for spaces of infinite dimension.

An isomorphism from one inner product space into another is a bijective mapping which preserves the three basic operations of an inner product space: vector addition, scalar multiplication, and inner products. Thus the above mappings (orthogonal and unitary) may also be characterized as the isomorphisms of V into itself. Note that such a mapping U also preserves distances, since
  ||U(v) − U(w)|| = ||U(v − w)|| = ||v − w||
and so U is also called an isometry.
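The length-preserving property of the rotation in Example 13.14 can be checked numerically. The following is a minimal sketch; the angle and the test vector are arbitrary choices, not taken from the text:

```python
# Sketch: verify numerically that the rotation of Example 13.14
# preserves lengths: T(x, y, z) = (x cos t - y sin t, x sin t + y cos t, z).
import math

def T(v, theta):
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

v = (3.0, -1.0, 2.0)   # arbitrary test vector
theta = 0.7            # arbitrary fixed angle
assert abs(norm(T(v, theta)) - norm(v)) < 1e-12
print("||T(v)|| == ||v||, as Theorem 13.9(iii) requires")
```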
ORTHOGONAL AND UNITARY MATRICES
Let U be a linear operator on an inner product space V. By Theorem 13.6 we obtain the following result when the base field K is complex.

Theorem 13.10A: A matrix A with complex entries represents a unitary operator U (relative to an orthonormal basis) if and only if A* = A⁻¹.

On the other hand, if the base field K is real then A* = Aᵗ; hence we have the following corresponding theorem for real inner product spaces.

Theorem 13.10B: A matrix A with real entries represents an orthogonal operator U (relative to an orthonormal basis) if and only if Aᵗ = A⁻¹.

The above theorems motivate the following definitions.

Definition: A complex matrix A for which A* = A⁻¹, or equivalently AA* = A*A = I, is called a unitary matrix.

Definition: A real matrix A for which Aᵗ = A⁻¹, or equivalently AAᵗ = AᵗA = I, is called an orthogonal matrix.

Observe that a unitary matrix with real entries is orthogonal.

Example 13.16: Suppose A = ( a₁ a₂ ; b₁ b₂ ) is a unitary matrix. Then AA* = I and hence
  AA* = ( a₁ a₂ ; b₁ b₂ )( ā₁ b̄₁ ; ā₂ b̄₂ ) = ( |a₁|² + |a₂|²   a₁b̄₁ + a₂b̄₂ ; b₁ā₁ + b₂ā₂   |b₁|² + |b₂|² ) = ( 1 0 ; 0 1 )
Thus
  |a₁|² + |a₂|² = 1, |b₁|² + |b₂|² = 1 and a₁b̄₁ + a₂b̄₂ = 0
Accordingly, the rows of A form an orthonormal set. Similarly, A*A = I forces the columns of A to form an orthonormal set.

The result in the above example holds true in general; namely,

Theorem 13.11: The following conditions for a matrix A are equivalent:
  (i) A is unitary (orthogonal).
  (ii) The rows of A form an orthonormal set.
  (iii) The columns of A form an orthonormal set.

Example 13.17: The matrix A representing the rotation T in Example 13.14 relative to the usual basis of R³ is
  A = ( cos θ  −sin θ  0 )
      ( sin θ   cos θ  0 )
      (   0       0    1 )
As expected, the rows and the columns of A each form an orthonormal set; that is, A is an orthogonal matrix.
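Theorem 13.11 can be spot-checked for the rotation matrix of Example 13.17. The sketch below (with an arbitrarily chosen angle) verifies that the rows satisfy ⟨rᵢ, rⱼ⟩ = δᵢⱼ:

```python
# Sketch: check Theorem 13.11 for the rotation matrix of Example 13.17 --
# its rows form an orthonormal set, which is equivalent to A^t A = I.
import math

theta = 1.1  # arbitrary angle
c, s = math.cos(theta), math.sin(theta)
A = [[c,  -s,  0.0],
     [s,   c,  0.0],
     [0.0, 0.0, 1.0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

n = len(A)
for i in range(n):
    for j in range(n):
        expected = 1.0 if i == j else 0.0   # <r_i, r_j> = delta_ij
        assert abs(dot(A[i], A[j]) - expected) < 1e-12
print("rows of A are orthonormal; A is an orthogonal matrix")
```

The same loop applied to the columns of A (i.e. to the rows of its transpose) verifies condition (iii) of the theorem.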
CHANGE OF ORTHONORMAL BASIS
In view of the special role of orthonormal bases in the theory of inner product spaces, we are naturally interested in the properties of the transition matrix from one such basis to another. The following theorem applies.

Theorem 13.12: Let {e₁, ..., eₙ} be an orthonormal basis of an inner product space V. Then the transition matrix from {eᵢ} into another orthonormal basis is unitary (orthogonal). Conversely, if P = (aᵢⱼ) is a unitary (orthogonal) matrix, then the following is an orthonormal basis:
  {eᵢ′ = a₁ᵢe₁ + a₂ᵢe₂ + ··· + aₙᵢeₙ : i = 1, ..., n}

Recall that matrices A and B representing the same linear operator T are similar, i.e. B = P⁻¹AP where P is the (nonsingular) transition matrix. On the other hand, if V is an inner product space, we are usually interested in the case when P is unitary (or orthogonal) as suggested by the above theorem. (Recall that P is unitary if P* = P⁻¹, and P is orthogonal if Pᵗ = P⁻¹.) This leads to the following definition.

Definition: Complex matrices A and B are unitarily equivalent if there is a unitary matrix P for which B = P*AP. Analogously, real matrices A and B are orthogonally equivalent if there is an orthogonal matrix P for which B = PᵗAP.

Observe that orthogonally equivalent matrices are necessarily congruent (see page 262).

POSITIVE OPERATORS
Let P be a linear operator on an inner product space V. P is said to be positive (or: semi-definite) if
  P = S*S for some operator S
and is said to be positive definite if S is also nonsingular. The next theorems give alternate characterizations of these operators.

Theorem 13.13A: The following conditions on an operator P are equivalent:
  (i) P = T² for some self-adjoint operator T.
  (ii) P = S*S for some operator S.
  (iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.
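The implication (ii) ⇒ (iii) of Theorem 13.13A can be illustrated numerically. The sketch below uses a randomly chosen real matrix S (an arbitrary choice, not from the text) and checks that P = SᵗS is symmetric with ⟨P(u), u⟩ ≥ 0:

```python
# Sketch: for an arbitrary real matrix S, the matrix P = S^t S is
# self-adjoint (symmetric) and satisfies <P(u), u> = ||S(u)||^2 >= 0,
# illustrating Theorem 13.13A (ii) => (iii).
import random

random.seed(0)
n = 3
S = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# P = S^t S
P = [[sum(S[k][i] * S[k][j] for k in range(n)) for j in range(n)]
     for i in range(n)]

# P is symmetric
assert all(abs(P[i][j] - P[j][i]) < 1e-12
           for i in range(n) for j in range(n))

# <P(u), u> >= 0 for random test vectors u
for _ in range(100):
    u = [random.uniform(-5, 5) for _ in range(n)]
    Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
    assert sum(a * b for a, b in zip(Pu, u)) >= -1e-9
print("P = S^t S is symmetric and positive semidefinite")
```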
The corresponding theorem for positive definite operators is

Theorem 13.13B: The following conditions on an operator P are equivalent:
  (i) P = T² for some nonsingular self-adjoint operator T.
  (ii) P = S*S for some nonsingular operator S.
  (iii) P is self-adjoint and ⟨P(u), u⟩ > 0 for every u ≠ 0 in V.

DIAGONALIZATION AND CANONICAL FORMS IN EUCLIDEAN SPACES
Let T be a linear operator on a finite dimensional inner product space V over K. Representing T by a diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence upon the roots of the characteristic polynomial Δ(t) of T (Theorem 9.6). Now Δ(t) always factors into linear polynomials over the complex field C, but may not have any linear polynomial factors over the real field R. Thus the situation for Euclidean spaces (where K = R) is inherently different from that for unitary spaces (where K = C); hence we treat them separately. We investigate Euclidean spaces below, and unitary spaces in the next section.

Theorem 13.14: Let T be a symmetric (self-adjoint) operator on a real finite dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.

Alternate Form of Theorem 13.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that B = P⁻¹AP = PᵗAP is diagonal.

We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then the diagonal entries of B are the corresponding eigenvalues.

Example 13.18: Let A = ( 2 −2 ; −2 5 ). We find an orthogonal matrix P such that PᵗAP is diagonal. The characteristic polynomial Δ(t) of A is
  Δ(t) = |tI − A| = | t−2   2 ; 2   t−5 | = (t − 6)(t − 1)
The eigenvalues of A are 6 and 1.
Substitute t = 6 into the matrix tI − A to obtain the corresponding homogeneous system of linear equations
  4x + 2y = 0, 2x + y = 0
A nonzero solution is v₁ = (1, −2). Next substitute t = 1 into the matrix tI − A to find the corresponding homogeneous system
  −x + 2y = 0, 2x − 4y = 0
A nonzero solution is v₂ = (2, 1). As expected by Problem 13.31, v₁ and v₂ are orthogonal. Normalize v₁ and v₂ to obtain the orthonormal basis
  {u₁ = (1/√5, −2/√5), u₂ = (2/√5, 1/√5)}
Finally let P be the matrix whose columns are u₁ and u₂ respectively. Then
  P = ( 1/√5  2/√5 ; −2/√5  1/√5 ) and P⁻¹AP = PᵗAP = ( 6 0 ; 0 1 )
As expected, the diagonal entries of PᵗAP are the eigenvalues corresponding to the columns of P.

We observe that the matrix B = P⁻¹AP = PᵗAP is also congruent to A. Now if q is a real quadratic form represented by the matrix A, then the above method can be used to diagonalize q under an orthogonal change of coordinates. This is illustrated in the next example.

Example 13.19: Find an orthogonal transformation of coordinates which diagonalizes the quadratic form q(x, y) = 2x² − 4xy + 5y².
The symmetric matrix representing q is A = ( 2 −2 ; −2 5 ). In the preceding example we obtained the orthogonal matrix
  P = ( 1/√5  2/√5 ; −2/√5  1/√5 ) for which PᵗAP = ( 6 0 ; 0 1 )
(Here 6 and 1 are the eigenvalues of A.) Thus the required orthogonal transformation of coordinates is
  ( x ; y ) = P ( x′ ; y′ ), that is, x = x′/√5 + 2y′/√5, y = −2x′/√5 + y′/√5
Under this change of coordinates q is transformed into the diagonal form
  q(x′, y′) = 6x′² + y′²
Note that the diagonal entries of q are the eigenvalues of A.

An orthogonal operator T need not be symmetric, and so it may not be represented by a diagonal matrix relative to an orthonormal basis. However, such an operator T does have a simple canonical representation, as described in the next theorem.

Theorem 13.15: Let T be an orthogonal operator on a real inner product space V.
Then there is an orthonormal basis with respect to which T is represented by a block diagonal matrix whose diagonal blocks are 1×1 blocks with entry ±1 and 2×2 blocks of the form
  ( cos θᵣ  −sin θᵣ )
  ( sin θᵣ   cos θᵣ )

The reader may recognize the above 2 by 2 diagonal blocks as representing rotations in the corresponding two-dimensional subspaces.

DIAGONALIZATION AND CANONICAL FORMS IN UNITARY SPACES
We now present the fundamental diagonalization theorem for complex inner product spaces, i.e. for unitary spaces. Recall that an operator T is said to be normal if it commutes with its adjoint, i.e. if TT* = T*T. Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose, i.e. if AA* = A*A.

Example 13.20: Let A = ( 1  1 ; i  3+2i ). Then
  AA* = ( 1  1 ; i  3+2i )( 1  −i ; 1  3−2i ) = ( 2  3−3i ; 3+3i  14 )
and
  A*A = ( 1  −i ; 1  3−2i )( 1  1 ; i  3+2i ) = ( 2  3−3i ; 3+3i  14 )
Thus A is a normal matrix.

The following theorem applies.

Theorem 13.16: Let T be a normal operator on a complex finite dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.

Alternate Form of Theorem 13.16: Let A be a normal matrix. Then there exists a unitary matrix P such that B = P⁻¹AP = P*AP is diagonal.

The next theorem shows that even non-normal operators on unitary spaces have a relatively simple form.

Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional inner product space V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V.

Alternate Form of Theorem 13.17: Let A be an arbitrary complex matrix. Then there exists a unitary matrix P such that B = P⁻¹AP = P*AP is triangular.

SPECTRAL THEOREM
The Spectral Theorem is a reformulation of the diagonalization Theorems 13.14 and 13.16.
Theorem 13.18 (Spectral Theorem): Let T be a normal (symmetric) operator on a complex (real) finite dimensional inner product space V. Then there exist orthogonal projections E₁, ..., E_r on V and scalars λ₁, ..., λ_r such that
  (i) T = λ₁E₁ + λ₂E₂ + ··· + λ_rE_r
  (ii) E₁ + E₂ + ··· + E_r = I
  (iii) EᵢEⱼ = 0 for i ≠ j.

The next example shows the relationship between a diagonal matrix representation and the corresponding orthogonal projections.

Example 13.21: Consider a diagonal matrix, say A = diag(2, 3, 5). Let
  E₁ = diag(1, 0, 0), E₂ = diag(0, 1, 0), E₃ = diag(0, 0, 1)
The reader can verify that the Eᵢ are projections, i.e. Eᵢ² = Eᵢ, and that
  (i) A = 2E₁ + 3E₂ + 5E₃, (ii) E₁ + E₂ + E₃ = I, (iii) EᵢEⱼ = 0 for i ≠ j

Solved Problems

INNER PRODUCTS
13.1. Verify the relation ⟨u, av₁ + bv₂⟩ = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩.
  Using [I₂], [I₁] and then [I₂], we find
  ⟨u, av₁ + bv₂⟩ = conj⟨av₁ + bv₂, u⟩ = conj(a⟨v₁, u⟩ + b⟨v₂, u⟩) = ā conj⟨v₁, u⟩ + b̄ conj⟨v₂, u⟩ = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩

13.2. Verify that the following is an inner product in R²:
  ⟨u, v⟩ = x₁y₁ − x₁y₂ − x₂y₁ + 3x₂y₂, where u = (x₁, x₂), v = (y₁, y₂)
  Method 1. We verify the three axioms of an inner product. Letting w = (z₁, z₂), we find
  au + bw = a(x₁, x₂) + b(z₁, z₂) = (ax₁ + bz₁, ax₂ + bz₂)
Applying the elementary row operation iZg ~* ^1 + ^2 *iid then the corresponding ele- mentary column operation C2 ^ Ci + C2, we transform A into diagonal form ( j . Thus A is positive definite and [Ig] holds. ^ ' 13.3. Find the norm of 1; = (3, 4) G R'^ with respect to: (i) the usual inner product, (ii) the inner product in Problem 13.2. (i) ||i;||2 = {v,v) = <(3,4), (3,4)> = 9 + 16 = 25; hence l|vll = 5. (ii) ||i;||2 = {v,v} = ((3,4), (3,4)> = 9 - 12 - 12 + 48 = 33; hence \\v\\ = a/sI. 13.4. Normalize each of the following vectors in Euclidean space R^: (i)M = (2,l,-l), {n)v = {hh-i). (i) Note {u, u) is the sum of the squares of the entries of u; that is, (u, u) — 2^ + 12 + (—1)2 = 6. Hence divide m by ||m|| = V<m. w) = \/6 to obtain the required unit vector: ul\\u\\ = (2/V6, l/Ve, -1/a/6) tear" of fractions: 12i; = (6, 8 ;he required unit vector is 12v/||12i;|| = (6/\/l09, 8/a/109. -3/\/i09 ) (ii) First multiply v by 12 to "clear" of fractions: 12i; = (6, 8, -3). We have <12v, 12t)) 62 + 82 + (—3)2 = 109. Then the required unit vector is 13.5. Let V be the vector space of polynomials with inner product given by {f,g) = Cf{t)g{t)dt. Let f{t) = t + 2 and g{t) = t^-2t-Z. Find (i) (f,g) and (ii) ||/||. (i) {f,9) = r (t + 2W-2t-S)dt = t*/4 - 7t2/2 - 6tT = -37/4 (ii) </,/> = r (t + 2)(t + 2)dt = 19/3 and ||/i| = y/Wj) = VW3 •'n CHAP. 13] INNER PRODUCT SPACES 293 13.6. Prove Theorem 13.1 (Cauchy-Schwarz): \{u,v}\ ^ \\u\\ \\v\\. If V = 0, the inequality reduces to — an d hen ce is valid. Now suppose v # 0. Using zz = |z|2 (for any complex number z) and {v,u) = {u,v}, we expand ||m — <m, v)tv||2 ^ where t is any real value: — ||m — <M, i;>ti;||2 = (u — {u,v)tv, u — (u,v)tv) = (u, u) — {u, v)t(u, v) — (u, v)t(v, u) + (u, V)(U, V)t^V, V) = ||m|P - 2t\{u,v)\^ + |(M,i;)|2t2||v||2 \(u v)\'^ Set t = l/\\v\\^ to find 0^ ||m|P- . ' , from which ](m,i>>P ^ ||m||2 H-yH^. Taking the square root of both sides, we obtain the required inequality. 13.7. 
Prove that the norm in an inner product space satisfies the following axioms: [2Vi]: \\v\\ ^ 0; and \\v\\ = if and only if v = 0. [N,\. ]H| = |fc||H|. [N,]: \\u + v\\^\\u\\ + \\v\\. By [/g], (v,v) — 0; hence \\v\\ = ^/l^v/v) — 0. Furthermore, ||i;|| = if and only if {v,v) = Q, and this holds if and only if v = Q. Thus [Wj] is valid. We find \\kv\\^ = {kv, kv) = kk(v,v) = |fc|2 ||-y||2. Taking the square root of both sides gives [ATg]. Using the Cauchy-Schwarz inequality, we obtain I |m + 0)112 = {u + v,u + v) — {u,u) + {u,v) + (u,v) + {v,v) ^ ||«|p + 2|M||HI + IMP = (IMI + IMIP Taking the square root of both sides yields [A^s]- Remark: [Ng] is frequently called the triangle inequality because , if we view m + -u as the side of the triangle formed with u and v (as illustrated on the right), then [Ng\ states that the length of one side of a triangle is less than or equal to the sum of the lengths of the other two sides. ORTHOGONALITY 13.8. Show that if u is orthogonal to v, then every scalar multiple of u is also orthogonal to V. Find a unit vector orthogonal to Vi = (1, 1, 2) and V2 = (0, 1, 3) in R^. If {u, V) = then {ku, v) = k{u, v) = fc • = 0, as required. Let w = {x, y, z). We want = {w,v{) = X + y + 2z and = (w, V2) = y + 3z Thus we obtain the homogeneous system X + y + 2z = 0, y + Sz = Set z = 1 to find y = —S and x = 1; then w = (1,-3,1). Normalize w to obtain the required unit vector w' orthogonal to v^ and ^2- W = w/||w|| = (l/yfTl, — 3/-\/ll, l/-v/li). 13.9. Let W be the subspace of R^ spanned by u = (1, 2, 3, -1, 2) and v = (2, 4, 7, 2, -1). Find a basis of the orthogonal complement W^ of W. We seek all vectors w = (x, y, z, s, t) such that {w,u) = X + 2y + Sz - s + 2t = (w,v) = 2x + 4:y + 7z + 2s - t = Eliminating x from the second equation, we find the equivalent system X + 2y + 3z - s + 2t = z + 4s ~ 5t = The free variables are y, s and t. Set y = —1, s = 0, t = to obtain the solution Wi = (2, —1, 0, 0, 0). 
Set y = 0, s = 1, t = 0 to find the solution w2 = (13, 0, −4, 1, 0). Set y = 0, s = 0, t = 1 to obtain the solution w3 = (−17, 0, 5, 0, 1). The set {w1, w2, w3} is a basis of W⊥.

13.10. Find an orthonormal basis of the subspace W of C³ spanned by v1 = (1, i, 0) and v2 = (1, 2, 1 − i).

Apply the Gram-Schmidt orthogonalization process. First normalize v1. We find
||v1||² = ⟨v1, v1⟩ = 1·1 + i·(−i) + 0·0 = 2  and so  ||v1|| = √2
Thus u1 = v1/||v1|| = (1/√2, i/√2, 0). To form w2 = v2 − ⟨v2, u1⟩u1, first compute
⟨v2, u1⟩ = ⟨(1, 2, 1 − i), (1/√2, i/√2, 0)⟩ = 1/√2 − 2i/√2 = (1 − 2i)/√2
Then
w2 = (1, 2, 1 − i) − ((1 − 2i)/√2)(1/√2, i/√2, 0) = ((1 + 2i)/2, (2 − i)/2, 1 − i)
Next normalize w2 or, equivalently, 2w2 = (1 + 2i, 2 − i, 2 − 2i). We have
||2w2||² = ⟨2w2, 2w2⟩ = (1 + 2i)(1 − 2i) + (2 − i)(2 + i) + (2 − 2i)(2 + 2i) = 18
and ||2w2|| = √18. Thus the required orthonormal basis of W is
u1 = (1/√2, i/√2, 0),  u2 = 2w2/||2w2|| = ((1 + 2i)/√18, (2 − i)/√18, (2 − 2i)/√18)

13.11. Prove Lemma 13.3: An orthonormal set {u1, ..., ur} is linearly independent and, for any v ∈ V, the vector
w = v − ⟨v, u1⟩u1 − ⟨v, u2⟩u2 − ⋯ − ⟨v, ur⟩ur
is orthogonal to each of the u_i.

Suppose a1u1 + ⋯ + ar ur = 0. Taking the inner product of both sides with respect to u1,
0 = ⟨0, u1⟩ = ⟨a1u1 + ⋯ + ar ur, u1⟩ = a1⟨u1, u1⟩ + a2⟨u2, u1⟩ + ⋯ + ar⟨ur, u1⟩ = a1·1 + a2·0 + ⋯ + ar·0 = a1
or a1 = 0. Similarly, for i = 2, ..., r,
0 = ⟨0, u_i⟩ = a1⟨u1, u_i⟩ + ⋯ + a_i⟨u_i, u_i⟩ + ⋯ + ar⟨ur, u_i⟩ = a_i
Accordingly, {u1, ..., ur} is linearly independent.

It remains to show that w is orthogonal to each of the u_i. Taking the inner product of w with respect to u1,
⟨w, u1⟩ = ⟨v, u1⟩ − ⟨v, u1⟩⟨u1, u1⟩ − ⟨v, u2⟩⟨u2, u1⟩ − ⋯ − ⟨v, ur⟩⟨ur, u1⟩
= ⟨v, u1⟩ − ⟨v, u1⟩·1 − ⟨v, u2⟩·0 − ⋯ − ⟨v, ur⟩·0 = 0
That is, w is orthogonal to u1. Similarly, for i = 2, ..., r, ⟨w, u_i⟩ = 0. Thus w is orthogonal to u_i for i = 1, ..., r, as claimed.

13.12.
Let W be a subspace of an inner product space V. Show that there is an orthonormal basis of W which is part of an orthonormal basis of V.

We choose a basis {v1, ..., vr} of W and extend it to a basis {v1, ..., vn} of V. We then apply the Gram-Schmidt orthogonalization process to {v1, ..., vn} to obtain an orthonormal basis {u1, ..., un} of V where, for i = 1, ..., n, u_i = a_i1 v1 + ⋯ + a_ii v_i. Thus u1, ..., ur ∈ W and therefore {u1, ..., ur} is an orthonormal basis of W.

13.13. Prove Theorem 13.2: Let W be a subspace of V; then V = W ⊕ W⊥.

By Problem 13.12 there exists an orthonormal basis {u1, ..., ur} of W which is part of an orthonormal basis {u1, ..., un} of V. Since {u1, ..., un} is orthonormal, u_{r+1}, ..., un ∈ W⊥. If v ∈ V, then v = a1u1 + ⋯ + an un where
a1u1 + ⋯ + ar ur ∈ W  and  a_{r+1}u_{r+1} + ⋯ + an un ∈ W⊥
Accordingly, V = W + W⊥.

On the other hand, if w ∈ W ∩ W⊥, then ⟨w, w⟩ = 0. This yields w = 0; hence W ∩ W⊥ = {0}.

The two conditions, V = W + W⊥ and W ∩ W⊥ = {0}, give the desired result V = W ⊕ W⊥.

Note that we have proved the theorem only for the case that V has finite dimension; we remark that the theorem also holds for spaces of arbitrary dimension.

13.14. Let W be a subspace of V. Show that W ⊆ W⊥⊥, and that W = W⊥⊥ when V has finite dimension.

Let w ∈ W. Then ⟨w, v⟩ = 0 for every v ∈ W⊥; hence w ∈ W⊥⊥. Accordingly, W ⊆ W⊥⊥. Now suppose V has finite dimension. By Theorem 13.2, V = W ⊕ W⊥ and, also, V = W⊥ ⊕ W⊥⊥. Hence
dim W = dim V − dim W⊥  and  dim W⊥⊥ = dim V − dim W⊥
This yields dim W = dim W⊥⊥. But W ⊆ W⊥⊥ by the above; hence W = W⊥⊥, as required.

13.15. Let {e1, ..., en} be an orthonormal basis of V.
Prove:
(i) for any u ∈ V, u = ⟨u, e1⟩e1 + ⟨u, e2⟩e2 + ⋯ + ⟨u, en⟩en;
(ii) ⟨a1e1 + ⋯ + an en, b1e1 + ⋯ + bn en⟩ = a1 conj(b1) + a2 conj(b2) + ⋯ + an conj(bn);
(iii) for any u, v ∈ V, ⟨u, v⟩ = ⟨u, e1⟩conj⟨v, e1⟩ + ⋯ + ⟨u, en⟩conj⟨v, en⟩;
(iv) if T: V → V is linear, then ⟨T(e_j), e_i⟩ is the ij-entry of the matrix A representing T in the given basis {e_i}.

(i) Suppose u = k1e1 + k2e2 + ⋯ + kn en. Taking the inner product of u with e1,
⟨u, e1⟩ = k1⟨e1, e1⟩ + k2⟨e2, e1⟩ + ⋯ + kn⟨en, e1⟩ = k1·1 + k2·0 + ⋯ + kn·0 = k1
Similarly, for i = 2, ..., n,
⟨u, e_i⟩ = k1⟨e1, e_i⟩ + ⋯ + k_i⟨e_i, e_i⟩ + ⋯ + kn⟨en, e_i⟩ = k_i
Substituting ⟨u, e_i⟩ for k_i in the equation u = k1e1 + ⋯ + kn en, we obtain the desired result.

(ii) We have
⟨Σ_{i=1}^n a_i e_i, Σ_{j=1}^n b_j e_j⟩ = Σ_{i,j} a_i conj(b_j) ⟨e_i, e_j⟩
But ⟨e_i, e_j⟩ = 0 for i ≠ j, and ⟨e_i, e_j⟩ = 1 for i = j; hence, as required,
⟨Σ a_i e_i, Σ b_j e_j⟩ = Σ_{i=1}^n a_i conj(b_i) = a1 conj(b1) + a2 conj(b2) + ⋯ + an conj(bn)

(iii) By (i), u = ⟨u, e1⟩e1 + ⋯ + ⟨u, en⟩en and v = ⟨v, e1⟩e1 + ⋯ + ⟨v, en⟩en. Then by (ii),
⟨u, v⟩ = ⟨u, e1⟩conj⟨v, e1⟩ + ⟨u, e2⟩conj⟨v, e2⟩ + ⋯ + ⟨u, en⟩conj⟨v, en⟩

(iv) By (i),
T(e1) = ⟨T(e1), e1⟩e1 + ⟨T(e1), e2⟩e2 + ⋯ + ⟨T(e1), en⟩en
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
T(en) = ⟨T(en), e1⟩e1 + ⟨T(en), e2⟩e2 + ⋯ + ⟨T(en), en⟩en
The matrix A representing T in the basis {e_i} is the transpose of the above matrix of coefficients; hence the ij-entry of A is ⟨T(e_j), e_i⟩.

ADJOINTS

13.16. Let T be the linear operator on C³ defined by
T(x, y, z) = (2x + (1 − i)y, (3 + 2i)x − 4iz, 2ix + (4 − 3i)y − 3z)
Find T*(x, y, z).

First find the matrix A representing T in the usual basis of C³ (see Problem 7.3):

A = ( 2       1 − i    0    )
    ( 3 + 2i  0       −4i   )
    ( 2i      4 − 3i  −3    )

Form the conjugate transpose A* of A:

A* = ( 2      3 − 2i  −2i    )
     ( 1 + i  0        4 + 3i )
     ( 0      4i      −3     )

Thus T*(x, y, z) = (2x + (3 − 2i)y − 2iz, (1 + i)x + (4 + 3i)z, 4iy − 3z).

13.17. Prove Theorem 13.5: Let φ be a linear functional on a finite dimensional inner product space V.
Then there exists a unique u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V.

Let {e1, ..., en} be an orthonormal basis of V. Set
u = conj(φ(e1)) e1 + conj(φ(e2)) e2 + ⋯ + conj(φ(en)) en
Let û be the linear functional on V defined by û(v) = ⟨v, u⟩ for every v ∈ V. Then for i = 1, ..., n,
û(e_i) = ⟨e_i, u⟩ = ⟨e_i, conj(φ(e1))e1 + ⋯ + conj(φ(en))en⟩ = φ(e_i)
Since û and φ agree on each basis vector, û = φ.

Now suppose u' is another vector in V for which φ(v) = ⟨v, u'⟩ for every v ∈ V. Then ⟨v, u⟩ = ⟨v, u'⟩ or ⟨v, u − u'⟩ = 0. In particular this is true for v = u − u', and so ⟨u − u', u − u'⟩ = 0. This yields u − u' = 0 and u = u'. Thus such a vector u is unique as claimed.

13.18. Prove Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V. Then there exists a unique linear operator T* on V such that ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V. Moreover, if A is the matrix representing T in an orthonormal basis {e_i} of V, then the conjugate transpose A* of A is the matrix representing T* in {e_i}.

We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map u ↦ ⟨T(u), v⟩ is a linear functional on V. Hence by Theorem 13.5 there exists a unique element v' ∈ V such that ⟨T(u), v⟩ = ⟨u, v'⟩ for every u ∈ V. We define T*: V → V by T*(v) = v'. Then ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.

We next show that T* is linear. For any v1, v2 ∈ V, and any a, b ∈ K,
⟨u, T*(av1 + bv2)⟩ = ⟨T(u), av1 + bv2⟩ = conj(a)⟨T(u), v1⟩ + conj(b)⟨T(u), v2⟩
= conj(a)⟨u, T*(v1)⟩ + conj(b)⟨u, T*(v2)⟩ = ⟨u, aT*(v1) + bT*(v2)⟩
But this is true for every u ∈ V; hence T*(av1 + bv2) = aT*(v1) + bT*(v2). Thus T* is linear.

By Problem 13.15(iv), the matrices A = (a_ij) and B = (b_ij) representing T and T* respectively in the basis {e_i} are given by a_ij = ⟨T(e_j), e_i⟩ and b_ij = ⟨T*(e_j), e_i⟩. Hence
b_ij = ⟨T*(e_j), e_i⟩ = conj⟨e_i, T*(e_j)⟩ = conj⟨T(e_i), e_j⟩ = conj(a_ji)
Thus B = A*, as claimed.

13.19. Prove Theorem 13.7: Let S and T be linear operators on a finite dimensional inner product space V and let k ∈ K.
Then: (i) (S + T)* = S* + T*; (ii) (kT)* = conj(k) T*; (iii) (ST)* = T*S*; (iv) (T*)* = T.

(i) For any u, v ∈ V,
⟨(S + T)(u), v⟩ = ⟨S(u) + T(u), v⟩ = ⟨S(u), v⟩ + ⟨T(u), v⟩ = ⟨u, S*(v)⟩ + ⟨u, T*(v)⟩ = ⟨u, (S* + T*)(v)⟩
The uniqueness of the adjoint implies (S + T)* = S* + T*.

(ii) For any u, v ∈ V,
⟨(kT)(u), v⟩ = ⟨kT(u), v⟩ = k⟨T(u), v⟩ = k⟨u, T*(v)⟩ = ⟨u, conj(k) T*(v)⟩ = ⟨u, (conj(k) T*)(v)⟩
The uniqueness of the adjoint implies (kT)* = conj(k) T*.

(iii) For any u, v ∈ V,
⟨(ST)(u), v⟩ = ⟨S(T(u)), v⟩ = ⟨T(u), S*(v)⟩ = ⟨u, T*(S*(v))⟩ = ⟨u, (T*S*)(v)⟩
The uniqueness of the adjoint implies (ST)* = T*S*.

(iv) For any u, v ∈ V,
⟨T*(u), v⟩ = conj⟨v, T*(u)⟩ = conj⟨T(v), u⟩ = ⟨u, T(v)⟩
The uniqueness of the adjoint implies (T*)* = T.

13.20. Show that: (i) I* = I; (ii) 0* = 0; (iii) if T is invertible, then (T⁻¹)* = (T*)⁻¹.

(i) For every u, v ∈ V, ⟨I(u), v⟩ = ⟨u, v⟩ = ⟨u, I(v)⟩; hence I* = I.

(ii) For every u, v ∈ V, ⟨0(u), v⟩ = ⟨0, v⟩ = 0 = ⟨u, 0⟩ = ⟨u, 0(v)⟩; hence 0* = 0.

(iii) I = I* = (TT⁻¹)* = (T⁻¹)*T*; hence (T⁻¹)* = (T*)⁻¹.

13.21. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that W⊥ is invariant under T*.

Let u ∈ W⊥. If w ∈ W, then T(w) ∈ W and so ⟨w, T*(u)⟩ = ⟨T(w), u⟩ = 0. Thus T*(u) ∈ W⊥ since it is orthogonal to every w ∈ W. Hence W⊥ is invariant under T*.

13.22. Let T be a linear operator on V. Show that each of the following conditions implies T = 0:
(i) ⟨T(u), v⟩ = 0 for every u, v ∈ V;
(ii) V is a complex space, and ⟨T(u), u⟩ = 0 for every u ∈ V;
(iii) T is self-adjoint and ⟨T(u), u⟩ = 0 for every u ∈ V.
Give an example of an operator T on a real space V for which ⟨T(u), u⟩ = 0 for every u ∈ V but T ≠ 0.

(i) Set v = T(u). Then ⟨T(u), T(u)⟩ = 0 and hence T(u) = 0, for every u ∈ V. Accordingly, T = 0.

(ii) By hypothesis, ⟨T(v + w), v + w⟩ = 0 for any v, w ∈ V. Expanding and setting ⟨T(v), v⟩ = 0 and ⟨T(w), w⟩ = 0,
⟨T(v), w⟩ + ⟨T(w), v⟩ = 0   (1)
Note w is arbitrary in (1).
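Returning for a moment to Problems 13.19 and 13.20: in matrix form the adjoint is the conjugate transpose (Theorem 13.6), so an identity such as (ST)* = T*S* can be spot-checked numerically. A minimal Python sketch; the two sample matrices are our own choice, not from the text:

```python
def ct(M):
    # Conjugate transpose M*: transpose of the entrywise complex conjugate
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Spot-check (ST)* = T* S* on two small complex matrices
S = [[1, 1j], [0, 2]]
T = [[2 - 1j, 1], [1j, 3]]
lhs = ct(matmul(S, T))
rhs = matmul(ct(T), ct(S))
```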
Substituting iw for w, and using ⟨T(v), iw⟩ = conj(i)⟨T(v), w⟩ = −i⟨T(v), w⟩ and ⟨T(iw), v⟩ = ⟨iT(w), v⟩ = i⟨T(w), v⟩,
−i⟨T(v), w⟩ + i⟨T(w), v⟩ = 0
Dividing through by i and adding to (1), we obtain ⟨T(w), v⟩ = 0 for any v, w ∈ V. By (i), T = 0.

(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding ⟨T(v + w), v + w⟩ = 0, we again obtain (1). Since T is self-adjoint and since it is a real space, we have ⟨T(w), v⟩ = ⟨w, T(v)⟩ = ⟨T(v), w⟩. Substituting this into (1), we obtain ⟨T(v), w⟩ = 0 for any v, w ∈ V. By (i), T = 0.

For our example, consider the linear operator T on R² defined by T(x, y) = (y, −x). Then ⟨T(u), u⟩ = 0 for every u ∈ V, but T ≠ 0.

ORTHOGONAL AND UNITARY OPERATORS AND MATRICES

13.23. Prove Theorem 13.9: The following conditions on an operator U are equivalent: (i) U* = U⁻¹; (ii) ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V; (iii) ||U(v)|| = ||v|| for every v ∈ V.

Suppose (i) holds. Then, for every v, w ∈ V,
⟨U(v), U(w)⟩ = ⟨v, U*U(w)⟩ = ⟨v, I(w)⟩ = ⟨v, w⟩
Thus (i) implies (ii). Now if (ii) holds, then
||U(v)|| = √⟨U(v), U(v)⟩ = √⟨v, v⟩ = ||v||
Hence (ii) implies (iii). It remains to show that (iii) implies (i).

Suppose (iii) holds. Then for every v ∈ V,
⟨U*U(v), v⟩ = ⟨U(v), U(v)⟩ = ⟨v, v⟩ = ⟨I(v), v⟩
Hence ⟨(U*U − I)(v), v⟩ = 0 for every v ∈ V. But U*U − I is self-adjoint (Prove!); then by Problem 13.22 we have U*U − I = 0 and so U*U = I. Thus U* = U⁻¹ as claimed.

13.24. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show that W⊥ is also invariant under U.

Since U is nonsingular, U(W) = W; that is, for any w ∈ W there exists w' ∈ W such that U(w') = w. Now let v ∈ W⊥. Then for any w ∈ W,
⟨U(v), w⟩ = ⟨U(v), U(w')⟩ = ⟨v, w'⟩ = 0
Thus U(v) belongs to W⊥. Therefore W⊥ is invariant under U.

13.25. Let A be a matrix with rows R_i and columns C_i. Show that: (i) the ij-entry of AA* is ⟨R_i, R_j⟩; (ii) the ij-entry of A*A is ⟨C_j, C_i⟩.

If A = (a_ij), then A* = (b_ij) where b_ij = conj(a_ji).
Thus AA* = (c_ij) where
c_ij = Σ_{k=1}^n a_ik b_kj = Σ_{k=1}^n a_ik conj(a_jk) = ⟨(a_i1, ..., a_in), (a_j1, ..., a_jn)⟩ = ⟨R_i, R_j⟩
as required. Also, A*A = (d_ij) where
d_ij = Σ_{k=1}^n b_ik a_kj = Σ_{k=1}^n conj(a_ki) a_kj = ⟨(a_1j, ..., a_nj), (a_1i, ..., a_ni)⟩ = ⟨C_j, C_i⟩

13.26. Prove Theorem 13.11: The following conditions for a matrix A are equivalent: (i) A is unitary (orthogonal). (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set.

Let R_i and C_i denote the rows and columns of A, respectively. By the preceding problem, AA* = (c_ij) where c_ij = ⟨R_i, R_j⟩. Thus AA* = I if and only if ⟨R_i, R_j⟩ = δ_ij. That is, (i) is equivalent to (ii). Also, by the preceding problem, A*A = (d_ij) where d_ij = ⟨C_j, C_i⟩. Thus A*A = I if and only if ⟨C_j, C_i⟩ = δ_ij. That is, (i) is equivalent to (iii).

Remark: Since (ii) and (iii) are equivalent, A is unitary (orthogonal) if and only if the transpose of A is unitary (orthogonal).

13.27. Find an orthogonal matrix A whose first row is u1 = (1/3, 2/3, 2/3).

First find a nonzero vector w2 = (x, y, z) which is orthogonal to u1, i.e. for which
0 = ⟨u1, w2⟩ = x/3 + 2y/3 + 2z/3  or  x + 2y + 2z = 0
One such solution is w2 = (0, 1, −1). Normalize w2 to obtain the second row of A, i.e. u2 = (0, 1/√2, −1/√2).

Next find a nonzero vector w3 = (x, y, z) which is orthogonal to both u1 and u2, i.e. for which
0 = ⟨u1, w3⟩ = x/3 + 2y/3 + 2z/3  or  x + 2y + 2z = 0
0 = ⟨u2, w3⟩ = y/√2 − z/√2  or  y − z = 0
Set z = −1 and find the solution w3 = (4, −1, −1). Normalize w3 and obtain the third row of A, i.e. u3 = (4/√18, −1/√18, −1/√18). Thus

A = ( 1/3      2/3      2/3    )
    ( 0        1/√2    −1/√2   )
    ( 4/3√2   −1/3√2   −1/3√2  )

We emphasize that the above matrix A is not unique.

13.28. Prove Theorem 13.12: Let {e1, ..., en} be an orthonormal basis of an inner product space V. Then the transition matrix from {e_i} into another orthonormal basis is unitary (orthogonal).
Conversely, if P = (a_ij) is a unitary (orthogonal) matrix, then the following is an orthonormal basis:
{e'_i = a_1i e1 + a_2i e2 + ⋯ + a_ni en : i = 1, ..., n}

Suppose {f_i} is another orthonormal basis and suppose
f_i = b_i1 e1 + b_i2 e2 + ⋯ + b_in en,  i = 1, ..., n   (1)
By Problem 13.15 and since {f_i} is orthonormal,
δ_ij = ⟨f_i, f_j⟩ = b_i1 conj(b_j1) + b_i2 conj(b_j2) + ⋯ + b_in conj(b_jn)   (2)
Let B = (b_ij) be the matrix of coefficients in (1). (Then Bᵗ is the transition matrix from {e_i} to {f_i}.) By Problem 13.25, BB* = (c_ij) where c_ij = b_i1 conj(b_j1) + ⋯ + b_in conj(b_jn). By (2), c_ij = δ_ij and therefore BB* = I. Accordingly B, and hence Bᵗ, are unitary.

It remains to prove that {e'_i} is orthonormal. By Problem 13.15,
⟨e'_i, e'_j⟩ = a_1i conj(a_1j) + a_2i conj(a_2j) + ⋯ + a_ni conj(a_nj) = ⟨C_i, C_j⟩
where C_i denotes the ith column of the unitary (orthogonal) matrix P = (a_ij). By Theorem 13.11, the columns of P are orthonormal; hence ⟨e'_i, e'_j⟩ = ⟨C_i, C_j⟩ = δ_ij. Thus {e'_i} is an orthonormal basis.

13.29. Suppose A is orthogonal. Show that det(A) = 1 or −1.

Since A is orthogonal, AAᵗ = I. Using |A| = |Aᵗ|,
1 = |I| = |AAᵗ| = |A||Aᵗ| = |A|²
Therefore |A| = 1 or −1.

13.30. Show that every 2 by 2 orthogonal matrix A for which det(A) = 1 is of the form

A = ( cos θ  −sin θ )
    ( sin θ   cos θ )

for some real number θ.

Suppose A = ( a b ; c d ). Since A is orthogonal, its rows form an orthonormal set; hence
a² + b² = 1,  c² + d² = 1,  ac + bd = 0,  ad − bc = 1
The last equation follows from det(A) = 1. We consider separately the cases a = 0 and a ≠ 0.

If a = 0, the first equation gives b² = 1 and therefore b = ±1. Then the fourth equation gives c = −b = ∓1, and the second equation yields 1 + d² = 1 or d = 0. Thus

A = (  0  1 )   or   A = ( 0  −1 )
    ( −1  0 )            ( 1   0 )

The first alternative has the required form with θ = −π/2, and the second alternative has the required form with θ = π/2.

If a ≠ 0, the third equation can be solved to give c = −bd/a.
Substituting this into the second equation,
b²d²/a² + d² = 1  or  b²d² + a²d² = a²  or  (b² + a²)d² = a²  or  d² = a²
and therefore a = d or a = −d. If a = −d, then the third equation yields c = b, and then the fourth equation gives −a² − b² = 1, which is impossible. Thus a = d. But then the third equation gives b = −c, and so

A = ( a  −c )
    ( c   a )

Since a² + c² = 1, there is a real number θ such that a = cos θ, c = sin θ, and hence A has the required form in this case also.

SYMMETRIC OPERATORS AND CANONICAL FORMS IN EUCLIDEAN SPACES

13.31. Let T be a symmetric operator. Show that: (i) the characteristic polynomial Δ(t) of T is a product of linear polynomials (over R); (ii) T has a nonzero eigenvector; (iii) eigenvectors of T belonging to distinct eigenvalues are orthogonal.

(i) Let A be a matrix representing T relative to an orthonormal basis of V; then A = Aᵗ. Let Δ(t) be the characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real eigenvalues by Theorem 13.8. Thus
Δ(t) = (t − λ1)(t − λ2) ⋯ (t − λn)
where the λ_i are all real. In other words, Δ(t) is a product of linear polynomials over R.

(ii) By (i), T has at least one (real) eigenvalue. Hence T has a nonzero eigenvector.

(iii) Suppose T(v) = λv and T(w) = μw where λ ≠ μ. We show that λ⟨v, w⟩ = μ⟨v, w⟩:
λ⟨v, w⟩ = ⟨λv, w⟩ = ⟨T(v), w⟩ = ⟨v, T(w)⟩ = ⟨v, μw⟩ = μ⟨v, w⟩
But λ ≠ μ; hence ⟨v, w⟩ = 0 as claimed.

13.32. Prove Theorem 13.14: Let T be a symmetric operator on a real inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds. Now suppose dim V = n > 1. By the preceding problem, there exists a nonzero eigenvector v1 of T. Let W be the space spanned by v1, and let u1 be a unit vector in W, e.g. let u1 = v1/||v1||.
Since v1 is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.21, W⊥ is invariant under T* = T. Thus the restriction T̂ of T to W⊥ is a symmetric operator.

By Theorem 13.2, V = W ⊕ W⊥. Hence dim W⊥ = n − 1 since dim W = 1. By induction, there exists an orthonormal basis {u2, ..., un} of W⊥ consisting of eigenvectors of T̂ and hence of T. But ⟨u1, u_i⟩ = 0 for i = 2, ..., n because u_i ∈ W⊥. Accordingly {u1, u2, ..., un} is an orthonormal set and consists of eigenvectors of T. Thus the theorem is proved.

13.33. Let A = ( 1 2 ; 2 1 ). Find a (real) orthogonal matrix P for which PᵗAP is diagonal.

The characteristic polynomial Δ(t) of A is
Δ(t) = |tI − A| = | t−1  −2 ; −2  t−1 | = t² − 2t − 3 = (t − 3)(t + 1)
and thus the eigenvalues of A are 3 and −1. Substitute t = 3 into the matrix tI − A to obtain the corresponding homogeneous system of linear equations
2x − 2y = 0,  −2x + 2y = 0
A nonzero solution is v1 = (1, 1). Normalize v1 to find the unit solution u1 = (1/√2, 1/√2).

Next substitute t = −1 into the matrix tI − A to obtain the corresponding homogeneous system of linear equations
−2x − 2y = 0,  −2x − 2y = 0
A nonzero solution is v2 = (1, −1). Normalize v2 to find the unit solution u2 = (1/√2, −1/√2).

Finally let P be the matrix whose columns are u1 and u2 respectively; then

P = ( 1/√2   1/√2 )   and   PᵗAP = ( 3   0 )
    ( 1/√2  −1/√2 )                ( 0  −1 )

As expected, the diagonal entries of PᵗAP are the eigenvalues of A.

13.34. Let A = ( 2 1 1 ; 1 2 1 ; 1 1 2 ). Find a (real) orthogonal matrix P for which PᵗAP is diagonal.

First find the characteristic polynomial Δ(t) of A:
Δ(t) = |tI − A| = | t−2 −1 −1 ; −1 t−2 −1 ; −1 −1 t−2 | = (t − 1)²(t − 4)
Thus the eigenvalues of A are 1 (with multiplicity two) and 4 (with multiplicity one). Substitute t = 1 into the matrix tI − A to obtain the corresponding homogeneous system
−x − y − z = 0,  −x − y − z = 0,  −x − y − z = 0
That is, x + y + z = 0. The system has two independent solutions. One such solution is v1 = (1, −1, 0).
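Before continuing, the result of Problem 13.33 above can be confirmed numerically; a minimal Python sketch (the helper names `matmul` and `transpose` are our own):

```python
import math

# Problem 13.33: A = [[1, 2], [2, 1]] has eigenvalues 3 and -1 with
# orthonormal eigenvectors u1 = (1, 1)/sqrt(2) and u2 = (1, -1)/sqrt(2)
A = [[1, 2], [2, 1]]
r2 = math.sqrt(2)
P = [[1/r2,  1/r2],
     [1/r2, -1/r2]]    # columns are u1 and u2

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

# P^t A P should come out as diag(3, -1)
D = matmul(transpose(P), matmul(A, P))
```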
We seek a second solution v2 = (a, b, c) which is also orthogonal to v1; that is, such that
a + b + c = 0  and also  a − b = 0
For example, v2 = (1, 1, −2). Next we normalize v1 and v2 to obtain the unit orthogonal solutions
u1 = (1/√2, −1/√2, 0),  u2 = (1/√6, 1/√6, −2/√6)

Now substitute t = 4 into the matrix tI − A to find the corresponding homogeneous system
2x − y − z = 0,  −x + 2y − z = 0,  −x − y + 2z = 0
Find a nonzero solution such as v3 = (1, 1, 1), and normalize v3 to obtain the unit solution u3 = (1/√3, 1/√3, 1/√3). Finally, if P is the matrix whose columns are the u_i respectively,

P = ( 1/√2   1/√6   1/√3 )                 ( 1  0  0 )
    (−1/√2   1/√6   1/√3 )   and   PᵗAP =  ( 0  1  0 )
    ( 0     −2/√6   1/√3 )                 ( 0  0  4 )

13.35. Find an orthogonal change of coordinates which diagonalizes the real quadratic form q(x, y) = 2x² + 2xy + 2y².

First find the symmetric matrix A representing q and then its characteristic polynomial Δ(t):
A = ( 2 1 ; 1 2 )   and   Δ(t) = |tI − A| = | t−2 −1 ; −1 t−2 | = t² − 4t + 3 = (t − 1)(t − 3)
The eigenvalues of A are 1 and 3; hence the diagonal form of q is
q(x', y') = x'² + 3y'²
We find the corresponding transformation of coordinates by obtaining a corresponding orthonormal set of eigenvectors of A.

Set t = 1 into the matrix tI − A to obtain the corresponding homogeneous system
−x − y = 0,  −x − y = 0
A nonzero solution is v1 = (1, −1). Now set t = 3 into the matrix tI − A to find the corresponding homogeneous system
x − y = 0,  −x + y = 0
A nonzero solution is v2 = (1, 1). As expected by Problem 13.31, v1 and v2 are orthogonal. Normalize v1 and v2 to obtain the orthonormal basis
{u1 = (1/√2, −1/√2), u2 = (1/√2, 1/√2)}

The transition matrix P and the required transformation of coordinates follow:

P = ( 1/√2  1/√2 )   and   x = (x' + y')/√2,  y = (−x' + y')/√2
    (−1/√2  1/√2 )

Note that the columns of P are u1 and u2. We can also express x' and y' in terms of x and y by using P⁻¹ = Pᵗ; that is,
x' = (x − y)/√2,  y' = (x + y)/√2

13.36. Prove Theorem 13.15: Let T be an orthogonal operator on a real inner product space V.
Then there is an orthonormal basis with respect to which T has the block diagonal form

diag( 1, ..., 1, −1, ..., −1, ( cos θ1  −sin θ1 ), ..., ( cos θr  −sin θr ) )
                              ( sin θ1   cos θ1 )       ( sin θr   cos θr )

that is, a block diagonal matrix whose diagonal blocks are 1×1 blocks 1 and −1 together with 2×2 rotation blocks.

Let S = T + T⁻¹ = T + T*. Then S* = (T + T*)* = T* + T = S. Thus S is a symmetric operator on V. By Theorem 13.14, there exists an orthonormal basis of V consisting of eigenvectors of S. If λ1, ..., λm denote the distinct eigenvalues of S, then V can be decomposed into the direct sum
V = V1 ⊕ V2 ⊕ ⋯ ⊕ Vm
where V_i consists of the eigenvectors of S belonging to λ_i. We claim that each V_i is invariant under T. For suppose v ∈ V_i; then S(v) = λ_i v and
S(T(v)) = (T + T⁻¹)T(v) = T(T + T⁻¹)(v) = TS(v) = T(λ_i v) = λ_i T(v)
That is, T(v) ∈ V_i. Hence V_i is invariant under T. Since the V_i are orthogonal to each other, we can restrict our investigation to the way that T acts on each individual V_i.

On a given V_i, (T + T⁻¹)v = S(v) = λ_i v. Multiplying by T,
(T² − λ_i T + I)(v) = 0
We consider the cases λ_i = ±2 and λ_i ≠ ±2 separately.

If λ_i = ±2, then (T ∓ I)²(v) = 0, which leads to (T ∓ I)(v) = 0 or T(v) = ±v. Thus T restricted to this V_i is either I or −I.

If λ_i ≠ ±2, then T has no eigenvectors in V_i, since by Theorem 13.8 the only eigenvalues of T are 1 or −1. Accordingly, for v ≠ 0 the vectors v and T(v) are linearly independent. Let W be the subspace spanned by v and T(v). Then W is invariant under T, since
T(T(v)) = T²(v) = λ_i T(v) − v
By Theorem 13.2, V_i = W ⊕ W⊥. Furthermore, by Problem 13.24, W⊥ is also invariant under T. Thus we can decompose V_i into the direct sum of two-dimensional subspaces W_j where the W_j are orthogonal to each other and each W_j is invariant under T. Thus we can now restrict our investigation to the way T acts on each individual W_j.

Since T² − λ_i T + I = 0, the characteristic polynomial Δ(t) of T acting on W_j is Δ(t) = t² − λ_i t + 1. Thus the determinant of T acting on W_j is 1, the constant term in Δ(t).
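The fact from Problem 13.30 that is about to be applied (a 2×2 orthogonal matrix of determinant 1 is a rotation) is easy to check numerically for a sample angle; a short Python sketch (the angle 0.7 is our own arbitrary choice):

```python
import math

def rotation(theta):
    # The 2x2 matrix of Problem 13.30: rows (cos t, -sin t) and (sin t, cos t)
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

A = rotation(0.7)
r1, r2 = A
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
# The rows should be orthonormal and the determinant should be 1
```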
By Problem 13.30, the matrix A representing T acting on W_j relative to any orthonormal basis of W_j must be of the form

( cos θ  −sin θ )
( sin θ   cos θ )

The union of the bases of the W_j gives an orthonormal basis of V_i, and the union of the bases of the V_i gives an orthonormal basis of V in which the matrix representing T is of the desired form.

NORMAL OPERATORS AND CANONICAL FORMS IN UNITARY SPACES

13.37. Determine which matrix is normal: (i) A = ( 1 i ; 0 1 ), (ii) B = ( 1 i ; 1 2+i ).

(i) AA* = ( 1 i ; 0 1 )( 1 0 ; −i 1 ) = ( 2 i ; −i 1 )  and  A*A = ( 1 0 ; −i 1 )( 1 i ; 0 1 ) = ( 1 i ; −i 2 )
Since AA* ≠ A*A, the matrix A is not normal.

(ii) BB* = ( 1 i ; 1 2+i )( 1 1 ; −i 2−i ) = ( 2 2+2i ; 2−2i 6 )  and  B*B = ( 1 1 ; −i 2−i )( 1 i ; 1 2+i ) = ( 2 2+2i ; 2−2i 6 )
Since BB* = B*B, the matrix B is normal.

13.38. Let T be a normal operator. Prove:
(i) T(v) = 0 if and only if T*(v) = 0.
(ii) T − λI is normal.
(iii) If T(v) = λv, then T*(v) = conj(λ)v; hence any eigenvector of T is also an eigenvector of T*.
(iv) If T(v) = λ1v and T(w) = λ2w where λ1 ≠ λ2, then ⟨v, w⟩ = 0; that is, eigenvectors of T belonging to distinct eigenvalues are orthogonal.

(i) We show that ⟨T(v), T(v)⟩ = ⟨T*(v), T*(v)⟩:
⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, TT*(v)⟩ = ⟨T*(v), T*(v)⟩
Hence by [I3], T(v) = 0 if and only if T*(v) = 0.

(ii) We show that T − λI commutes with its adjoint:
(T − λI)(T − λI)* = (T − λI)(T* − conj(λ)I) = TT* − conj(λ)T − λT* + λ conj(λ)I
= T*T − λT* − conj(λ)T + conj(λ)λI = (T* − conj(λ)I)(T − λI) = (T − λI)*(T − λI)
Thus T − λI is normal.

(iii) If T(v) = λv, then (T − λI)(v) = 0. Now T − λI is normal by (ii); therefore, by (i), (T − λI)*(v) = 0. That is, (T* − conj(λ)I)(v) = 0; hence T*(v) = conj(λ)v.

(iv) We show that λ1⟨v, w⟩ = λ2⟨v, w⟩:
λ1⟨v, w⟩ = ⟨λ1v, w⟩ = ⟨T(v), w⟩ = ⟨v, T*(w)⟩ = ⟨v, conj(λ2)w⟩ = λ2⟨v, w⟩
But λ1 ≠ λ2; hence ⟨v, w⟩ = 0.

13.39. Prove Theorem 13.16: Let T be a normal operator on a complex finite dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

The proof is by induction on the dimension of V.
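As an aside, the computation of Problem 13.37 above (with the matrices as reconstructed here) can be repeated in code, since normality of a matrix is just the equality M M* = M* M; a minimal Python sketch:

```python
def conj_transpose(M):
    # M*: transpose of the entrywise complex conjugate
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_normal(M):
    # M is normal iff M commutes with its conjugate transpose
    Ms = conj_transpose(M)
    return matmul(M, Ms) == matmul(Ms, M)

# Problem 13.37: A should fail the test, B should pass it
A = [[1, 1j], [0, 1]]
B = [[1, 1j], [1, 2 + 1j]]
```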
If dim V = 1, then the theorem trivially holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v and let u1 be a unit vector in W.

Since v is an eigenvector of T, the subspace W is invariant under T. However, v is also an eigenvector of T* by the preceding problem; hence W is also invariant under T*. By Problem 13.21, W⊥ is invariant under T** = T.

The remainder of the proof is identical with the latter part of the proof of Theorem 13.14 (Problem 13.32).

13.40. Prove Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional inner product space V. Then T can be represented by a triangular matrix relative to an orthonormal basis {u1, u2, ..., un}; that is, for i = 1, ..., n,
T(u_i) = a_i1 u1 + a_i2 u2 + ⋯ + a_ii u_i

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v and let u1 be a unit vector in W. Then u1 is an eigenvector of T and, say, T(u1) = a_11 u1.

By Theorem 13.2, V = W ⊕ W⊥. Let E denote the orthogonal projection of V onto W⊥. Clearly W⊥ is invariant under the operator ET. By induction, there exists an orthonormal basis {u2, ..., un} of W⊥ such that, for i = 2, ..., n,
ET(u_i) = a_i2 u2 + a_i3 u3 + ⋯ + a_ii u_i
(Note that {u1, u2, ..., un} is an orthonormal basis of V.) But E is the orthogonal projection of V onto W⊥; hence we must have
T(u_i) = a_i1 u1 + a_i2 u2 + ⋯ + a_ii u_i
for i = 2, ..., n. This with T(u1) = a_11 u1 gives us the desired result.

MISCELLANEOUS PROBLEMS

13.41. Prove Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) P = T² for some self-adjoint operator T.
(ii) P = S*S for some operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.

Suppose (i) holds, that is, P = T² where T = T*. Then P = TT = T*T, and so (i) implies (ii). Now suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P, and so P is self-adjoint. Furthermore,
⟨P(u), u⟩ = ⟨S*S(u), u⟩ = ⟨S(u), S(u)⟩ ≥ 0
Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).

Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis {u1, ..., un} of V consisting of eigenvectors of P; say, P(u_i) = λ_i u_i. By Theorem 13.8, the λ_i are real. Using (iii), we show that the λ_i are nonnegative. We have, for each i,
0 ≤ ⟨P(u_i), u_i⟩ = ⟨λ_i u_i, u_i⟩ = λ_i ⟨u_i, u_i⟩
Since ⟨u_i, u_i⟩ = 1, this forces λ_i ≥ 0, as claimed. Accordingly, √λ_i is a real number. Let T be the linear operator defined by
T(u_i) = √λ_i u_i, for i = 1, ..., n
Since T is represented by a real diagonal matrix relative to the orthonormal basis {u_i}, T is self-adjoint. Moreover, for each i,
T²(u_i) = T(√λ_i u_i) = √λ_i T(u_i) = √λ_i √λ_i u_i = λ_i u_i = P(u_i)
Since T² and P agree on a basis of V, P = T². Thus the theorem is proved.

Remark: The above operator T is the unique positive operator such that P = T² (Problem 13.93); it is called the positive square root of P.

13.42. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator.

Set S = ½(T + T*) and U = ½(T − T*). Then T = S + U where
S* = (½(T + T*))* = ½(T* + T**) = ½(T* + T) = S
and
U* = (½(T − T*))* = ½(T* − T) = −½(T − T*) = −U
i.e. S is self-adjoint and U is skew-adjoint.

13.43. Prove: Let T be an arbitrary linear operator on a finite dimensional inner product space V. Then T is the product of a unitary (orthogonal) operator U and a unique positive operator P, that is, T = UP. Furthermore, if T is invertible, then U is also uniquely determined.

By Theorem 13.13, T*T is a positive operator and hence there exists a (unique) positive operator P such that P² = T*T (Problem 13.93).
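Problem 13.42 above has a direct matrix analogue: any square matrix T splits into a self-adjoint part S = (T + T*)/2 and a skew-adjoint part U = (T − T*)/2. A short Python sketch; the sample matrix is our own arbitrary choice:

```python
def conj_transpose(M):
    # M*: transpose of the entrywise complex conjugate
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

T = [[1 + 1j, 2], [3j, 4 - 1j]]   # sample matrix
Ts = conj_transpose(T)
n = len(T)

# Self-adjoint part S and skew-adjoint part U, with T = S + U
S = [[(T[i][j] + Ts[i][j]) / 2 for j in range(n)] for i in range(n)]
U = [[(T[i][j] - Ts[i][j]) / 2 for j in range(n)] for i in range(n)]
total = [[S[i][j] + U[i][j] for j in range(n)] for i in range(n)]
```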
Observe that
||P(v)||² = ⟨P(v), P(v)⟩ = ⟨P²(v), v⟩ = ⟨T*T(v), v⟩ = ⟨T(v), T(v)⟩ = ||T(v)||²   (1)
We now consider separately the cases when T is invertible and non-invertible.

If T is invertible, then we set Û = PT⁻¹. We show that Û is unitary:
Û* = (PT⁻¹)* = (T⁻¹)*P* = (T*)⁻¹P  and  Û*Û = (T*)⁻¹PPT⁻¹ = (T*)⁻¹T*TT⁻¹ = I
Thus Û is unitary. We next set U = Û⁻¹. Then U is also unitary and T = UP as required.

To prove uniqueness, we assume T = U0P0 where U0 is unitary and P0 is positive. Then
T*T = P0*U0*U0P0 = P0 I P0 = P0²
But the positive square root of T*T is unique (Problem 13.93); hence P0 = P. (Note that the invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also invertible by (1). Multiplying U0P = UP on the right by P⁻¹ yields U0 = U. Thus U is also unique when T is invertible.

Now suppose T is not invertible. Let W be the image of P, i.e. W = Im P. We define U1: W → V by
U1(w) = T(v)  where  P(v) = w   (2)
We must show that U1 is well defined, that is, that P(v) = P(v') implies T(v) = T(v'). This follows from the fact that P(v − v') = 0 is equivalent to ||P(v − v')|| = 0, which forces ||T(v − v')|| = 0 by (1). Thus U1 is well defined.

Note by (1) that P and T have the same kernels. Hence the images of P and T have the same dimension, i.e. dim (Im P) = dim W = dim (Im T). Consequently, W⊥ and (Im T)⊥ also have the same dimension. We let U2 be any inner-product-preserving isomorphism between W⊥ and (Im T)⊥ (for example, one mapping an orthonormal basis onto an orthonormal basis).

We next set U = U1 ⊕ U2. (Here U is defined as follows: if v ∈ V and v = w + w' where w ∈ W, w' ∈ W⊥, then U(v) = U1(w) + U2(w').) Now U is linear (Problem 13.121) and, if v ∈ V and P(v) = w, then by (2)
T(v) = U1(w) = U(w) = UP(v)
Thus T = UP as required.

It remains to show that U is unitary. Now every vector x ∈ V can be written in the form x = P(v) + w' where w' ∈ W⊥. Then U(x) = UP(v) + U2(w') = T(v) + U2(w') where ⟨T(v), U2(w')⟩ = 0 by definition of U2. Also, ⟨T(v), T(v)⟩ = ⟨P(v), P(v)⟩ by (1).
Thus
⟨U(x), U(x)⟩ = ⟨T(v) + U2(w'), T(v) + U2(w')⟩ = ⟨T(v), T(v)⟩ + ⟨U2(w'), U2(w')⟩
= ⟨P(v), P(v)⟩ + ⟨w', w'⟩ = ⟨P(v) + w', P(v) + w'⟩ = ⟨x, x⟩
(We also used the fact that ⟨P(v), w'⟩ = 0.) Thus U is unitary and the theorem is proved.

13.44. Let (a1, a2, ...) and (b1, b2, ...) be any pair of points in ℓ²-space of Example 13.5. Show that the sum
Σ_{i=1}^∞ a_i b_i = a1b1 + a2b2 + ⋯
converges absolutely.

By Problem 1.16 (Cauchy-Schwarz inequality),
|a1b1| + ⋯ + |an bn| ≤ √(Σ_{i=1}^n a_i²) √(Σ_{i=1}^n b_i²) ≤ √(Σ_{i=1}^∞ a_i²) √(Σ_{i=1}^∞ b_i²)
which holds for every n. Thus the (monotonic) sequence of sums S_n = |a1b1| + ⋯ + |an bn| is bounded, and therefore converges. Hence the infinite sum converges absolutely.

13.45. Let V be the vector space of polynomials over R with inner product defined by ⟨f, g⟩ = ∫₀¹ f(t) g(t) dt. Give an example of a linear functional φ on V for which Theorem 13.5 does not hold, i.e. there does not exist a polynomial h(t) for which φ(f) = ⟨f, h⟩ for every f ∈ V.

Let φ: V → R be defined by φ(f) = f(0); that is, φ evaluates f(t) at 0 and hence maps f(t) into its constant term. Suppose a polynomial h(t) exists for which
φ(f) = f(0) = ∫₀¹ f(t) h(t) dt   (1)
for every polynomial f(t). Observe that φ maps the polynomial t f(t) into 0; hence by (1),
∫₀¹ t f(t) h(t) dt = 0   (2)
for every polynomial f(t). In particular, (2) must hold for f(t) = t h(t), that is,
∫₀¹ t² h²(t) dt = 0
This integral forces h(t) to be the zero polynomial; hence φ(f) = ⟨f, h⟩ = ⟨f, 0⟩ = 0 for every polynomial f(t). This contradicts the fact that φ is not the zero functional; hence the polynomial h(t) does not exist.

Supplementary Problems

INNER PRODUCTS

13.46. Verify that
⟨a1u1 + a2u2, b1v1 + b2v2⟩ = a1 conj(b1)⟨u1, v1⟩ + a1 conj(b2)⟨u1, v2⟩ + a2 conj(b1)⟨u2, v1⟩ + a2 conj(b2)⟨u2, v2⟩
More generally, prove that
⟨Σ_{i=1}^m a_i u_i, Σ_{j=1}^n b_j v_j⟩ = Σ_{i,j} a_i conj(b_j) ⟨u_i, v_j⟩

13.47.
(i) Verify that the following is an inner product on R²:
    f(u, v) = x₁y₁ − 2x₁y₂ − 2x₂y₁ + 5x₂y₂
(ii) For what values of k is the following an inner product on R²?
    f(u, v) = x₁y₁ − 3x₁y₂ − 3x₂y₁ + kx₂y₂
(iii) For what values of a, b, c, d ∈ R is the following an inner product on R²?
    f(u, v) = ax₁y₁ + bx₁y₂ + cx₂y₁ + dx₂y₂

13.48. Find the norm of v = (1, 2) ∈ R² with respect to (i) the usual inner product, (ii) the inner product in Problem 13.47(i).

13.49. Let u = (z₁, z₂) and v = (w₁, w₂) belong to C².
(i) Verify that the following is an inner product on C²:
    f(u, v) = z₁w̄₁ + (1 + i)z₁w̄₂ + (1 − i)z₂w̄₁ + 3z₂w̄₂
(ii) For what values of a, b, c, d ∈ C is the following an inner product on C²?
    f(u, v) = az₁w̄₁ + bz₁w̄₂ + cz₂w̄₁ + dz₂w̄₂

13.50. Find the norm of v = (1 − 2i, 2 + 3i) ∈ C² with respect to (i) the usual inner product, (ii) the inner product in Problem 13.49(i).

13.51. Show that the distance function d(u, v) = ||u − v||, where u, v ∈ V, satisfies the following axioms of a metric space:
    [D₁] d(u, v) ≥ 0; and d(u, v) = 0 if and only if u = v.
    [D₂] d(u, v) = d(v, u).
    [D₃] d(u, v) ≤ d(u, w) + d(w, v).

13.52. Verify the Parallelogram Law: ||u + v||² + ||u − v||² = 2||u||² + 2||v||².

13.53. Verify the following polar forms for ⟨u, v⟩:
(i) ⟨u, v⟩ = ¼||u + v||² − ¼||u − v||² (real case);
(ii) ⟨u, v⟩ = ¼||u + v||² − ¼||u − v||² + (i/4)||u + iv||² − (i/4)||u − iv||² (complex case).

13.54. Let V be the vector space of m × n matrices over R. Show that ⟨A, B⟩ = tr(Bᵗ A) defines an inner product in V.

13.55. Let V be the vector space of polynomials over R. Show that ⟨f, g⟩ = ∫₀¹ f(t) g(t) dt defines an inner product in V.

13.56. Find the norm of each of the following vectors:
(i) u = (1/2, −1/3, 1/4, 1/6) ∈ R⁴,
(ii) v = (1 − 2i, 3 + i, 2 − 5i) ∈ C³,
(iii) f(t) = t² − 2t + 3 in the space of Problem 13.55,
(iv) A = [1 2; 3 −4] in the space of Problem 13.54.

13.57.
Show that: (i) the sum of two inner products is an inner product; (ii) a positive multiple of an inner product is an inner product.

13.58. Let a, b, c ∈ R be such that at² + bt + c ≥ 0 for every t ∈ R. Show that b² − 4ac ≤ 0. Use this result to prove the Cauchy-Schwarz inequality for real inner product spaces by expanding ||tu + v||² ≥ 0.

13.59. Suppose |⟨u, v⟩| = ||u|| ||v||. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show that u and v are linearly dependent.

13.60. Find the cosine of the angle θ between u and v if:
(i) u = (1, −3, 2), v = (2, 1, 5) in R³;
(ii) u = 2t − 1, v = t² in the space of Problem 13.55;
(iii) u = [2 1; −1 3], v = [0 −1; 3 2] in the space of Problem 13.54.

ORTHOGONALITY

13.61. Find a basis of the subspace W of R⁴ orthogonal to u₁ = (1, −2, 3, 4) and u₂ = (3, −5, 7, 8).

13.62. Find an orthonormal basis for the subspace W of C³ spanned by w₁ = (1, i, 1) and w₂ = (1 + i, 0, 2).

13.63. Let V be the vector space of polynomials over R of degree ≤ 2 with inner product ⟨f, g⟩ = ∫₀¹ f(t) g(t) dt.
(i) Find a basis of the subspace W orthogonal to h(t) = 2t + 1.
(ii) Apply the Gram-Schmidt orthogonalization process to the basis {1, t, t²} to obtain an orthonormal basis {u₁(t), u₂(t), u₃(t)} of V.

13.64. Let V be the vector space of 2 × 2 matrices over R with inner product defined by ⟨A, B⟩ = tr(Bᵗ A).
(i) Show that the following is an orthonormal basis of V:
    [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1]
(ii) Find a basis for the orthogonal complement of (a) the diagonal matrices, (b) the symmetric matrices.

13.65. Let W be a subset (not necessarily a subspace) of V. Prove: (i) W⊥ = L(W)⊥; (ii) if V has finite dimension, then W⊥⊥ = L(W). (Here L(W) is the space spanned by W.)

13.66. Let W be the subspace spanned by a nonzero vector w in V, and let E be the orthogonal projection of V onto W. Prove that
    E(v) = (⟨v, w⟩ / ||w||²) w
We call E(v) the projection of v along w.

13.67.
Find the projection of v along w if:
(i) v = (1, −1, 2), w = (0, 1, 1) in R³;
(ii) v = (1 − i, 2 + 3i), w = (2 − i, 3) in C²;
(iii) v = 2t − 1, w = t² in the space of Problem 13.55;
(iv) v = [1 2; · ·], w = [0 −1; · ·] in the space of Problem 13.54.

13.68. Suppose {u₁, ..., u_r} is a basis of a subspace W of V where dim V = n. Let {v₁, ..., v_{n−r}} be an independent set of n − r vectors such that ⟨uᵢ, vⱼ⟩ = 0 for each i and each j. Show that {v₁, ..., v_{n−r}} is a basis of the orthogonal complement W⊥.

13.69. Suppose {u₁, ..., u_r} is an orthonormal basis for a subspace W of V. Let E : V → V be the linear mapping defined by
    E(v) = ⟨v, u₁⟩u₁ + ⟨v, u₂⟩u₂ + ⋯ + ⟨v, u_r⟩u_r
Show that E is the orthogonal projection of V onto W.

13.70. Let {u₁, ..., u_r} be an orthonormal subset of V. Show that, for any v ∈ V, Σᵢ₌₁ʳ |⟨v, uᵢ⟩|² ≤ ||v||². (This is known as Bessel's Inequality.)

13.71. Let V be a real inner product space. Show that:
(i) ||u|| = ||v|| if and only if ⟨u + v, u − v⟩ = 0;
(ii) ||u + v||² = ||u||² + ||v||² if and only if ⟨u, v⟩ = 0.
Show by counterexamples that the above statements are not true for, say, C².

13.72. Let U and W be subspaces of a finite dimensional inner product space V. Show that: (i) (U + W)⊥ = U⊥ ∩ W⊥; (ii) (U ∩ W)⊥ = U⊥ + W⊥.

ADJOINT OPERATOR

13.73. Let T : R³ → R³ be defined by T(x, y, z) = (x + 2y, 3x − 4z, y). Find T*(x, y, z).

13.74. Let T : C³ → C³ be defined by
    T(x, y, z) = (ix + (2 + 3i)y, 3x + (3 − i)z, (2 − 5i)y + iz)
Find T*(x, y, z).

13.75. For each of the following linear functionals φ on V, find a vector u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V:
(i) φ : R³ → R defined by φ(x, y, z) = x + 2y − 3z.
(ii) φ : C³ → C defined by φ(x, y, z) = ix + (2 + 3i)y + (1 − 2i)z.
(iii) φ : V → R defined by φ(f) = f(1), where V is the vector space of Problem 13.63.

13.76. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the kernel of T, i.e. Im T* = (Ker T)⊥.
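The projection formula of Problems 13.66 and 13.67 is easy to check numerically. Below is a minimal sketch in Python (not part of the original text) computing E(v) = (⟨v, w⟩/||w||²)w for the data of Problem 13.67(i).

```python
# Sketch for Problems 13.66-13.67: the projection of v along w is
# E(v) = (<v,w>/||w||^2) w, computed here with the usual dot product on R^n.

def dot(u, v):
    # standard inner product on R^n
    return sum(a * b for a, b in zip(u, v))

def proj(v, w):
    # projection of v along the nonzero vector w
    c = dot(v, w) / dot(w, w)
    return tuple(c * wi for wi in w)

# Problem 13.67(i): v = (1, -1, 2), w = (0, 1, 1) in R^3
print(proj((1, -1, 2), (0, 1, 1)))  # (0.0, 0.5, 0.5)
```

Here ⟨v, w⟩ = 1 and ||w||² = 2, so the projection is (0, 1/2, 1/2), agreeing with the answer listed later.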
Hence rank(T) = rank(T*).

13.77. Show that T*T = 0 implies T = 0.

13.78. Let V be the vector space of polynomials over R with inner product defined by ⟨f, g⟩ = ∫₀¹ f(t) g(t) dt. Let D be the derivative operator on V, i.e. D(f) = df/dt. Show that there is no operator D* on V such that ⟨D(f), g⟩ = ⟨f, D*(g)⟩ for every f, g ∈ V. That is, D has no adjoint.

UNITARY AND ORTHOGONAL OPERATORS AND MATRICES

13.79. Find an orthogonal matrix whose first row is: (i) (1/√5, 2/√5); (ii) a multiple of (1, 1, 1).

13.80. Find a symmetric orthogonal matrix whose first row is (1/3, 2/3, 2/3). (Compare with Problem 13.27.)

13.81. Find a unitary matrix whose first row is: (i) a multiple of (1, 1 − i); (ii) (1/2, (1/2)i, 1/2 − (1/2)i).

13.82. Prove: The product and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal matrices form a group under multiplication, called the orthogonal group.)

13.83. Prove: The product and inverses of unitary matrices are unitary. (Thus the unitary matrices form a group under multiplication, called the unitary group.)

13.84. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal.

13.85. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that B = P*AP. Show that this relation is an equivalence relation.

13.86. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such that B = PᵗAP. Show that this relation is an equivalence relation.

13.87. Let W be a subspace of V. For any v ∈ V let v = w + w' where w ∈ W, w' ∈ W⊥. (Such a sum is unique because V = W ⊕ W⊥.) Let T : V → V be defined by T(v) = w − w'. Show that T is a self-adjoint unitary operator on V.

13.88. Let V be an inner product space, and suppose U : V → V (not necessarily linear) is surjective (onto) and preserves inner products, i.e. ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V. Prove that U is linear and hence unitary.

POSITIVE AND POSITIVE DEFINITE OPERATORS

13.89.
Show that the sum of two positive (positive definite) operators is positive (positive definite).

13.90. Let T be a linear operator on V and let f : V × V → K be defined by f(u, v) = ⟨T(u), v⟩. Show that f is itself an inner product on V if and only if T is positive definite.

13.91. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive (positive definite) if k ≥ 0 (k > 0).

13.92. Prove Theorem 13.13B, page 288, on positive definite operators. (The corresponding Theorem 13.13A for positive operators is proved in Problem 13.41.)

13.93. Consider the operator T defined by T(uᵢ) = √λᵢ uᵢ, i = 1, ..., n, in the proof of Theorem 13.13A (Problem 13.41). Show that T is positive and that it is the only positive operator for which T² = P.

13.94. Suppose P is both positive and unitary. Prove that P = I.

13.95. An n × n (real or complex) matrix A = (aᵢⱼ) is said to be positive if A viewed as a linear operator on Kⁿ is positive. (An analogous definition defines a positive definite matrix.) Prove that A is positive (positive definite) if and only if aᵢⱼ = āⱼᵢ and
    Σᵢ,ⱼ₌₁ⁿ aᵢⱼ xⱼ x̄ᵢ ≥ 0 (> 0)
for every (x₁, ..., xₙ) in Kⁿ.

13.96. Determine which of the following matrices are positive (positive definite):
(i) (ii) (iii) (iv) (v) (vi)

13.97. Prove that a 2 × 2 complex matrix A = [a b; c d] is positive if and only if (i) A = A*, and (ii) a, d and ad − bc are nonnegative real numbers.

13.98. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a nonnegative (positive) real number.

SELF-ADJOINT AND SYMMETRIC OPERATORS

13.99. For any operator T, show that T + T* is self-adjoint and T − T* is skew-adjoint.

13.100. Suppose T is self-adjoint. Show that T²(v) = 0 implies T(v) = 0. Use this to prove that Tⁿ(v) = 0 also implies T(v) = 0 for n > 0.

13.101. Let V be a complex inner product space. Suppose ⟨T(v), v⟩ is real for every v ∈ V. Show that T is self-adjoint.

13.102.
Suppose S and T are self-adjoint. Show that ST is self-adjoint if and only if S and T commute, i.e. ST = TS.

13.103. For each of the following symmetric matrices A, find an orthogonal matrix P for which PᵗAP is diagonal:

13.104. Find an orthogonal transformation of coordinates which diagonalizes each quadratic form:
(i) q(x, y) = 2x² − 6xy + 10y², (ii) q(x, y) = x² + 8xy − 5y²

13.105. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form q(x, y, z) = 2xy + 2xz + 2yz.

NORMAL OPERATORS AND MATRICES

13.106. Verify that A = [2 i; i 2] is normal. Find a unitary matrix P such that P*AP is diagonal, and find P*AP.

13.107. Show that a triangular matrix is normal if and only if it is diagonal.

13.108. Prove that if T is normal on V, then ||T(v)|| = ||T*(v)|| for every v ∈ V. Prove that the converse holds in complex inner product spaces.

13.109. Show that self-adjoint, skew-adjoint and unitary (orthogonal) operators are normal.

13.110. Suppose T is normal. Prove that:
(i) T is self-adjoint if and only if its eigenvalues are real.
(ii) T is unitary if and only if its eigenvalues have absolute value 1.
(iii) T is positive if and only if its eigenvalues are nonnegative real numbers.

13.111. Show that if T is normal, then T and T* have the same kernel and the same image.

13.112. Suppose S and T are normal and commute. Show that S + T and ST are also normal.

13.113. Suppose T is normal and commutes with S. Show that T also commutes with S*.

13.114. Prove: Let S and T be commuting normal operators on a complex finite dimensional vector space V. Then there exists an orthonormal basis of V consisting of eigenvectors of both S and T. (That is, S and T can be simultaneously diagonalized.)

ISOMORPHISM PROBLEMS

13.115. Let {e₁, ..., eₙ} be an orthonormal basis of an inner product space V over K. Show that the map v ↦ [v]ₑ is an (inner product space) isomorphism between V and Kⁿ. (Here [v]ₑ denotes the coordinate vector of v in the basis {eᵢ}.)

13.116.
Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same dimension.

13.117. Suppose {e₁, ..., eₙ} and {e₁', ..., eₙ'} are orthonormal bases of V and W respectively. Let T : V → W be the linear map defined by T(eᵢ) = eᵢ' for each i. Show that T is an isomorphism.

13.118. Let V be an inner product space. Recall (page 283) that each u ∈ V determines a linear functional û in the dual space V* by the definition û(v) = ⟨v, u⟩ for every v ∈ V. Show that the map u ↦ û is linear and nonsingular, and hence an isomorphism from V onto V*.

13.119. Consider the inner product space V of Problem 13.54. Show that V is isomorphic to Rᵐⁿ under the mapping A ↦ (R₁, R₂, ..., R_m), where Rᵢ = (aᵢ₁, aᵢ₂, ..., aᵢₙ) is the ith row of A.

MISCELLANEOUS PROBLEMS

13.120. Show that there exists an orthonormal basis {u₁, ..., uₙ} of V consisting of eigenvectors of T if and only if there exist orthogonal projections E₁, ..., E_r and scalars λ₁, ..., λ_r such that: (i) T = λ₁E₁ + ⋯ + λ_rE_r; (ii) E₁ + ⋯ + E_r = I; (iii) EᵢEⱼ = 0 for i ≠ j.

13.121. Suppose V = U ⊕ W and suppose T₁ : U → V and T₂ : W → V are linear. Show that T = T₁ ⊕ T₂ is also linear. (Here T is defined as follows: if v ∈ V and v = u + w where u ∈ U, w ∈ W, then T(v) = T₁(u) + T₂(w).)

13.122. Suppose U is an orthogonal operator on R³ with positive determinant. Show that U is a rotation about some axis.

Answers to Supplementary Problems

13.47. (ii) k > 9; (iii) b = c, a > 0, d > 0, ad − bc > 0

13.48. (i) √5, (ii) √13

13.50. (i) 3√2, (ii) 5√2

13.56. (i) ||u|| = √65/12, (ii) ||v|| = 2√11, (iii) ||f(t)|| = √(83/15), (iv) ||A|| = √30

13.60. (i) cos θ = 9/√420, (ii) cos θ = √15/6, (iii) cos θ = 2/√210

13.61. {v₁ = (1, 2, 1, 0), v₂ = (4, 4, 0, 1)}

13.62. {v₁ = (1, i, 1)/√3, v₂ = (2i, 1 − 3i, 3 − i)/√24}

13.63. (i) {f₁(t) = 7t² − 5t, f₂(t) = 12t² − 5}
(ii) {u₁(t) = 1, u₂(t) = √3 (2t − 1), u₃(t) = √5 (6t² − 6t + 1)}

13.67.
(i) (0, 1/2, 1/2), (ii) (26 + 7i, 27 + 24i)/14, (iii) 5t²/6

13.73. T*(x, y, z) = (x + 3y, 2x + z, −4y)

13.74. T*(x, y, z) = (−ix + 3y, (2 − 3i)x + (2 + 5i)z, (3 + i)y − iz)

13.75. Let u = c̄₁e₁ + ⋯ + c̄ₙeₙ, where cᵢ = φ(eᵢ) and {eᵢ} is an orthonormal basis.
(i) u = (1, 2, −3), (ii) u = (−i, 2 − 3i, 1 + 2i), (iii) u = 30t² − 24t + 3

13.79. (i) [1/√5 2/√5; 2/√5 −1/√5]; (ii) [1/√3 1/√3 1/√3; 2/√6 −1/√6 −1/√6; 0 1/√2 −1/√2]

13.80. [1/3 2/3 2/3; 2/3 −2/3 1/3; 2/3 1/3 −2/3]

13.81. (i) [1/√3 (1 − i)/√3; (1 + i)/√3 −1/√3]

13.96. Only (i) and (v) are positive. Moreover, (v) is positive definite.

13.103. (i) P = [2/√5 1/√5; −1/√5 2/√5], (ii) P = [2/√5 −1/√5; 1/√5 2/√5], (iii) P = [3/√10 1/√10; −1/√10 3/√10]

13.104. (i) x = (3x' − y')/√10, y = (x' + 3y')/√10; (ii) x = (2x' − y')/√5, y = (x' + 2y')/√5

13.105. x = x'/√3 + y'/√2 + z'/√6, y = x'/√3 − y'/√2 + z'/√6, z = x'/√3 − 2z'/√6

13.106. P = [1/√2 −1/√2; 1/√2 1/√2], P*AP = [2 + i 0; 0 2 − i]

Appendix A

Sets and Relations

SETS, ELEMENTS

Any well defined list or collection of objects is called a set; the objects comprising the set are called its elements or members. We write
    p ∈ A if p is an element in the set A
If every element of A also belongs to a set B, i.e. if x ∈ A implies x ∈ B, then A is called a subset of B or is said to be contained in B; this is denoted by
    A ⊂ B or B ⊃ A
Two sets are equal if they both contain the same elements; that is,
    A = B if and only if A ⊂ B and B ⊂ A
The negations of p ∈ A, A ⊂ B and A = B are written p ∉ A, A ⊄ B and A ≠ B respectively.

We specify a particular set by either listing its elements or by stating properties which characterize the elements of the set. For example,
    A = {1, 3, 5, 7, 9}
means A is the set consisting of the numbers 1, 3, 5, 7 and 9; and
    B = {x : x is a prime number, x < 15}
means that B is the set of prime numbers less than 15. We also use special symbols to denote sets which occur very often in the text.
Unless otherwise specified:
    N = the set of positive integers: 1, 2, 3, ...;
    Z = the set of integers: ..., −2, −1, 0, 1, 2, ...;
    Q = the set of rational numbers;
    R = the set of real numbers;
    C = the set of complex numbers.
We also use ∅ to denote the empty or null set, i.e. the set which contains no elements; this set is assumed to be a subset of every other set.

Frequently the members of a set are sets themselves. For example, each line in a set of lines is a set of points. To help clarify these situations, we use the words class, collection and family synonymously with set. The words subclass, subcollection and subfamily have meanings analogous to subset.

Example A.1: The sets A and B above can also be written as
    A = {x ∈ N : x is odd, x < 10} and B = {2, 3, 5, 7, 11, 13}
Observe that 9 ∈ A but 9 ∉ B, and 11 ∈ B but 11 ∉ A; whereas 3 ∈ A and 3 ∈ B, and 6 ∉ A and 6 ∉ B.

Example A.2: The sets of numbers are related as follows: N ⊂ Z ⊂ Q ⊂ R ⊂ C.

Example A.3: Let C = {x : x² = 4, x is odd}. Then C = ∅, that is, C is the empty set.

Example A.4: The members of the class {{2, 3}, {2}, {5, 6}} are the sets {2, 3}, {2} and {5, 6}.

The following theorem applies.

Theorem A.1: Let A, B and C be any sets. Then: (i) A ⊂ A; (ii) if A ⊂ B and B ⊂ A, then A = B; and (iii) if A ⊂ B and B ⊂ C, then A ⊂ C.

We emphasize that A ⊂ B does not exclude the possibility that A = B. However, if A ⊂ B but A ≠ B, then we say that A is a proper subset of B. (Some authors use the symbol ⊆ for a subset and the symbol ⊂ only for a proper subset.)

When we speak of an indexed set {aᵢ : i ∈ I}, or simply {aᵢ}, we mean that there is a mapping φ from the set I to a set A and that the image φ(i) of i ∈ I is denoted aᵢ. The set I is called the indexing set and the elements aᵢ (the range of φ) are said to be indexed by I. A set {a₁, a₂, ...} indexed by the positive integers N is called a sequence.
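The membership claims in Examples A.1 and A.3 can be spot-checked with Python's built-in set type; this is a small illustrative sketch, not part of the original text.

```python
# Examples A.1 and A.3 checked with Python sets.
A = {x for x in range(1, 10) if x % 2 == 1}   # {x in N : x odd, x < 10}
B = {2, 3, 5, 7, 11, 13}                      # primes less than 15
print(A)  # {1, 3, 5, 7, 9}

assert 9 in A and 9 not in B
assert 11 in B and 11 not in A
assert 3 in A and 3 in B
assert 6 not in A and 6 not in B

# Example A.3: no odd integer squares to 4, so C is empty
C = {x for x in range(-10, 10) if x * x == 4 and x % 2 == 1}
assert C == set()
```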
An indexed class of sets {Aᵢ : i ∈ I}, or simply {Aᵢ}, has an analogous meaning except that now the map φ assigns to each i ∈ I a set Aᵢ rather than an element aᵢ.

SET OPERATIONS

Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set of elements belonging to A or to B; and the intersection of A and B, written A ∩ B, is the set of elements belonging to both A and B:
    A ∪ B = {x : x ∈ A or x ∈ B} and A ∩ B = {x : x ∈ A and x ∈ B}
If A ∩ B = ∅, that is, if A and B do not have any elements in common, then A and B are said to be disjoint.

We assume that all our sets are subsets of a fixed universal set (denoted here by U). Then the complement of A, written Aᶜ, is the set of elements which do not belong to A:
    Aᶜ = {x ∈ U : x ∉ A}

Example A.5: The following diagrams, called Venn diagrams, illustrate the above set operations. Here sets are represented by simple plane areas and U, the universal set, by the area in the entire rectangle.
    [Venn diagrams: A ∪ B shaded; A ∩ B shaded; Aᶜ shaded]

Sets under the above operations satisfy various laws or identities which are listed in the table below. In fact, we state

Theorem A.2: Sets satisfy the laws in Table 1.

LAWS OF THE ALGEBRA OF SETS
    Idempotent Laws:    1a. A ∪ A = A                          1b. A ∩ A = A
    Associative Laws:   2a. (A ∪ B) ∪ C = A ∪ (B ∪ C)          2b. (A ∩ B) ∩ C = A ∩ (B ∩ C)
    Commutative Laws:   3a. A ∪ B = B ∪ A                      3b. A ∩ B = B ∩ A
    Distributive Laws:  4a. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)    4b. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
    Identity Laws:      5a. A ∪ ∅ = A                          5b. A ∩ U = A
                        6a. A ∪ U = U                          6b. A ∩ ∅ = ∅
    Complement Laws:    7a. A ∪ Aᶜ = U                         7b. A ∩ Aᶜ = ∅
                        8a. (Aᶜ)ᶜ = A                          8b. Uᶜ = ∅, ∅ᶜ = U
    De Morgan's Laws:   9a. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ                 9b. (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
Table 1

Remark: Each of the above laws follows from an analogous logical law. For example,
    A ∩ B = {x : x ∈ A and x ∈ B} = {x : x ∈ B and x ∈ A} = B ∩ A
(Here we use the fact that the composite statement "p and q", written p ∧ q, is logically equivalent to the composite statement "q and p", i.e. q ∧ p.)
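The identities in Table 1 can be verified mechanically for concrete sets. The sketch below (not part of the original text) checks De Morgan's Laws 9a and 9b with Python's set operators, using set difference from a universal set U for the complement.

```python
# De Morgan's Laws (9a, 9b of Table 1) on sample sets inside a universal set U.
U = set(range(10))
A = {1, 3, 5, 7, 9}
B = {2, 3, 5, 7}

lhs_9a = U - (A | B)         # (A union B)^c
rhs_9a = (U - A) & (U - B)   # A^c intersect B^c
lhs_9b = U - (A & B)         # (A intersect B)^c
rhs_9b = (U - A) | (U - B)   # A^c union B^c

print(lhs_9a == rhs_9a, lhs_9b == rhs_9b)  # True True
```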
The relationship between set inclusion and the above set operations follows.

Theorem A.3: Each of the following conditions is equivalent to A ⊂ B:
    (i) A ∩ B = A, (ii) A ∪ B = B, (iii) Bᶜ ⊂ Aᶜ, (iv) A ∩ Bᶜ = ∅, (v) B ∪ Aᶜ = U

We generalize the above set operations as follows. Let {Aᵢ : i ∈ I} be any family of sets. Then the union of the Aᵢ, written ∪_{i∈I} Aᵢ (or simply ∪ᵢ Aᵢ), is the set of elements each belonging to at least one of the Aᵢ; and the intersection of the Aᵢ, written ∩_{i∈I} Aᵢ (or simply ∩ᵢ Aᵢ), is the set of elements each belonging to every Aᵢ.

PRODUCT SETS

Let A and B be two sets. The product set of A and B, denoted by A × B, consists of all ordered pairs (a, b) where a ∈ A and b ∈ B:
    A × B = {(a, b) : a ∈ A, b ∈ B}
The product of a set with itself, say A × A, is denoted by A².

Example A.6: The reader is familiar with the cartesian plane R² = R × R. Here each point P represents an ordered pair (a, b) of real numbers, and vice versa.

Example A.7: Let A = {1, 2, 3} and B = {a, b}. Then
    A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}

Remark: The ordered pair (a, b) is defined rigorously by (a, b) = {{a}, {a, b}}. From this definition, the "order" property may be proven; that is, (a, b) = (c, d) if and only if a = c and b = d.

The concept of product set is extended to any finite number of sets in a natural way. The product set of the sets A₁, ..., A_m, written A₁ × A₂ × ⋯ × A_m, is the set consisting of all m-tuples (a₁, a₂, ..., a_m) where aᵢ ∈ Aᵢ for each i.

RELATIONS

A binary relation or simply relation R from a set A to a set B assigns to each ordered pair (a, b) ∈ A × B exactly one of the following statements: (i) "a is related to b", written a R b, (ii) "a is not related to b". A relation from a set A to the same set A is called a relation in A.

Example A.8: Set inclusion is a relation in any class of sets. For, given any pair of sets A and B, either A ⊂ B or A ⊄ B.
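The product set of Example A.7 can be enumerated directly; the sketch below (not part of the original text) uses Python's itertools.product, which yields exactly the ordered pairs of A × B.

```python
from itertools import product

# Example A.7: A x B as the list of ordered pairs (a, b) with a in A, b in B.
A = [1, 2, 3]
B = ['a', 'b']
AxB = list(product(A, B))
print(AxB)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

# |A x B| = |A| * |B|
assert len(AxB) == len(A) * len(B)
```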
Observe that any relation R from A to B uniquely defines a subset R of A × B as follows:
    R = {(a, b) : a R b}
Conversely, any subset R of A × B defines a relation from A to B as follows:
    a R b if and only if (a, b) ∈ R
In view of the above correspondence between relations from A to B and subsets of A × B, we redefine a relation as follows:

Definition: A relation R from A to B is a subset of A × B.

EQUIVALENCE RELATIONS

A relation in a set A is called an equivalence relation if it satisfies the following axioms:
    [E₁] Every a ∈ A is related to itself.
    [E₂] If a is related to b, then b is related to a.
    [E₃] If a is related to b and b is related to c, then a is related to c.
In general, a relation is said to be reflexive if it satisfies [E₁], symmetric if it satisfies [E₂], and transitive if it satisfies [E₃]. In other words, a relation is an equivalence relation if it is reflexive, symmetric and transitive.

Example A.9: Consider the relation ⊂ of set inclusion. By Theorem A.1, A ⊂ A for every set A; and if A ⊂ B and B ⊂ C, then A ⊂ C. That is, ⊂ is both reflexive and transitive. On the other hand, ⊂ is not symmetric, since A ⊂ B and A ≠ B implies B ⊄ A.

Example A.10: In Euclidean geometry, similarity of triangles is an equivalence relation. For if α, β and γ are any triangles, then: (i) α is similar to itself; (ii) if α is similar to β, then β is similar to α; and (iii) if α is similar to β and β is similar to γ, then α is similar to γ.
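For a concrete relation on a finite sample, the axioms [E₁]–[E₃] can be checked by brute force. The sketch below (not part of the original text) does this for congruence mod 5 on a range of integers, anticipating Example A.11.

```python
# Brute-force check of [E1]-[E3] for the relation "x is congruent to y mod 5"
# on a finite sample of integers.
S = range(-10, 11)

def related(x, y):
    return (x - y) % 5 == 0

# [E1] reflexive
assert all(related(a, a) for a in S)
# [E2] symmetric
assert all(related(b, a) for a in S for b in S if related(a, b))
# [E3] transitive
assert all(related(a, c) for a in S for b in S for c in S
           if related(a, b) and related(b, c))
print("congruence mod 5 is an equivalence relation on the sample")
```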
each aGA belongs to a member of A/R, and the mem- bers of A/R are pairwise disjoint. Example A.11 : Let R5 be the relation in Z, the set of integers defined by X = y (mod 5) which reads "x is congruent to y modulo 5" and which means "x - y is divisible by 5". Then ^5 is an equivalence relation in Z. There are exactly five distinct equivalence classes in Z/R^: Ao == {...,-10,-5,0,5,10} Ai = {...,-9,-4,1,6,11} A2 = {...,-8,-3,2,7,12} A3 = {...,-7,-2,3,8,13} A4 = {...,-6,-1,4,9,14} Now each integer x is uniquely expressible in the form x = 5q + r where - r < 5; observe that x G E^ where r is the remainder. Note that the equivalence classes are pairwise disjoint and that Z = A0UA1UA2UA3UA4. Appendix B Algebraic Structures INTRODUCTION We define here algebraic structures which occur in almost all branches of mathematics. In particular we will define a field which appears in the definition of a vector space. We begin with the definition of a group, which is a relatively simple algebraic structure with only one operation and is used as a building block for many other algebraic systems. GROUPS Let G be a nonempty set with a binary operation, i.e. to each pair of elements a,b GG there is assigned an element ab G G. Then G is called a group if the following axioms hold: [Gi] For any a,b,c G G, we have {ab)c = a{bc) (the associative law). [G2] There exists an element e GG, called the identity element, such that ae = ea = a for every a GG. [Ga] For each a GG there exists an element a'^GG, called the inverse of a, such that aa~^ = a~^a = e. A group G is said to be abelian (or: commutative) if the commutative law holds, i.e. if ab = ha for every a,h GG. When the binary operation is denoted by juxtaposition as above, the group G is said to be written multiplicatively. Sometimes, when G is abelian, the binary operation is de- noted by + and G is said to be written additively. 
In such case the identity element is denoted by 0 and is called the zero element; and the inverse is denoted by −a and is called the negative of a.

If A and B are subsets of a group G then we write
    AB = {ab : a ∈ A, b ∈ B}, or A + B = {a + b : a ∈ A, b ∈ B}
We also write a for {a}.

A subset H of a group G is called a subgroup of G if H itself forms a group under the operation of G. If H is a subgroup of G and a ∈ G, then the set Ha is called a right coset of H and the set aH is called a left coset of H.

Definition: A subgroup H of G is called a normal subgroup if a⁻¹Ha ⊂ H for every a ∈ G. Equivalently, H is normal if aH = Ha for every a ∈ G, i.e. if the right and left cosets of H coincide.

Note that every subgroup of an abelian group is normal.

Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group under coset multiplication. This group is called the quotient group and is denoted by G/H.

Example B.1: The set Z of integers forms an abelian group under addition. (We remark that the even integers form a subgroup of Z but the odd integers do not.) Let H denote the set of multiples of 5, i.e. H = {..., −10, −5, 0, 5, 10, ...}. Then H is a subgroup (necessarily normal) of Z. The cosets of H in Z follow:
The permutations of n symbols (see page 171) form a group under composition of mappings; it is called the symmetric group of degree n and is denoted by S„. We investigate S3 here; its elements are <'i "2 "a <t>i 02 — (3 a \) (III) Here n 2 3\ f . . , 1 is the permutation which maps 1 ►-» t, 2 I-* j, 3 } k tiplication table of S3 is The mul- e <'l «'2 "3 01 02 6 e "1 "2 "3 01 02 "1 "1 € 01 02 "2 ''S "2 "2 H e 01 ''S "1 "3 "3 01 02 e "1 "2 01 •Pi •'a <fl "2 02 e <f>2 02 "2 "3 <'l € 01 (The element in the ath row and 6th column is ab.) The set H = {e, ffj is a sub- group of S3; its right and left cosets are Right Cosets Left Cosets H = {e,<r,} H = {e,„,} H^l = {01, 02} <f>iH — {^j, org} H<t>2 = {02, tfs} 02-^ = {02><'2} Observe that the right cosets and the left cosets are distinct; hence H is not a normal subgroup of S3. A mapping / from a group G into a group G' is called a homomorphism if /(a6) = /(a)/(b) for every a.bGG. (If / is also bijective, i.e. one-to-one and onto, then / is called an isomorphism and G and G' are said to be isomorphic.) If f:G-*G' is a homomorphism, then the feerraei of / is the set of elements of G which map into the identity element e' e G': kernel of / = {aGG: f(a) = e'} (As usual, /(G) is called the image of the mapping /: G^G'.) The following theorem applies. Theorem B.2: Let /: G-» G' be a homomorphism with kernel K. Then X is a normal subgroup of G, and the quotient group GIK is isomorphic to the image of /. 322 ALGEBRAIC STRUCTURES [APPENDIX B Example B^: Let G be the group of real numbers under addition, and let G' be the group of positive real numbers under multiplication. The mapping f : G -* G' defined by /(a) — 2" is a homomorphism because f(a+b) = 2° + " = 2''2i' = f{a)f(b) In particular, / is bijective; hence G and G' are isomorphic. Example B.4: Let G be the group of nonzero complex numbers under multiplication, and let G' be the group of nonzero real numbers under multiplication. 
The mapping f : G -* G' defined by f(z) — \z\ is a homomorphism because /(ziZa) = |ziZ2| = |zi| [zal = /(^i) f(H) The kernel K of f consists of those complex numbers z on the unit circle, i.e. for which \z\ = 1. Thus G/K is isomorphic to the image of /, i.e. to the group of positive real numbers under multiplication. RINGS, INTEGRAL DOMAINS AND FIELDS Let i? be a nonempty set with two binary operations, an operation of addition (denoted by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a ring if the following axioms are satisfied: [Ri] For any a,b,e G R, we have {a + h) + c = a + (6 + c). [Ri] There exists an element G /?, called the zero element, such that a + = + a = a for every aGR. [Rs\ For each a G J? there exists an element —a G R, called the negative of a, such that a + (—a) = (—a) + a = 0. [R^ For any a,b G R, we have a + b = b + a. [Rs] For any a,b,cG R, we have {ab)c = a{bc). [Re] For any a,b,c G R, we have: (i) a{b + c) = ab + ac, and (ii) (b + e)a =ba + ca. Observe that the axioms [Ri] through [Rt] may be summarized by saying that R is an abelian group under addition. Subtraction is defined iniJby a — b = a + (— &). It can be shown (see Problem B.25) that a- = • a = for every a G R. R is called a commutative ring if ab — ba for every a,b G R. We also say that R is a ring with a unit element if there exists a nonzero element 1 G R such that o • 1 = 1 • a = o for every a G R. A nonempty subset S of i? is called a subring of R if S itself forms a ring under the operations of R. We note that S is a subring of R if and only if a, & G S implies a-b GS and ab G S. A nonempty subset / of jB is called a left ideal in R if: (i) a — 6 G / whenever a,b G I, and (ii) ra G I whenever r GR, aG I. Note that a left ideal I in R is also a subring of R. Similarly we can define a right ideal and a two-sided ideal. Clearly all ideals in com- mutative rings are two-sided. 
The term ideal shall mean two-sided ideal unless otherwise specified.

Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I : a ∈ R} form a ring under coset addition and coset multiplication. This ring is denoted by R/I and is called the quotient ring.

Now let R be a commutative ring with a unit element. For any a ∈ R, the set (a) = {ra : r ∈ R} is an ideal; it is called the principal ideal generated by a. If every ideal in R is a principal ideal, then R is called a principal ideal ring.

Definition: A commutative ring R with a unit element is called an integral domain if R has no zero divisors, i.e. if ab = 0 implies a = 0 or b = 0.

Definition: A commutative ring R with a unit element is called a field if every nonzero a ∈ R has a multiplicative inverse, i.e. there exists an element a⁻¹ ∈ R such that aa⁻¹ = a⁻¹a = 1.

A field is necessarily an integral domain; for if ab = 0 and a ≠ 0, then
    b = 1·b = a⁻¹ab = a⁻¹·0 = 0
We remark that a field may also be viewed as a commutative ring in which the nonzero elements form a group under multiplication.

Example B.5: The set Z of integers with the usual operations of addition and multiplication is the classical example of an integral domain with a unit element. Every ideal I in Z is a principal ideal, i.e. I = (n) for some integer n. The quotient ring Zₙ = Z/(n) is called the ring of integers modulo n. If n is prime, then Zₙ is a field. On the other hand, if n is not prime then Zₙ has zero divisors. For example, in the ring Z₆, 2·3 = 0 yet 2 ≠ 0 and 3 ≠ 0.

Example B.6: The rational numbers Q and the real numbers R each form a field with respect to the usual operations of addition and multiplication.

Example B.7: Let C denote the set of ordered pairs of real numbers with addition and multiplication defined by
    (a, b) + (c, d) = (a + c, b + d)
    (a, b)·(c, d) = (ac − bd, ad + bc)
Then C satisfies all the required properties of a field.
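Example B.5 can be tested exhaustively for small n: a composite modulus produces zero divisors, while a prime modulus makes every nonzero element invertible. A minimal sketch, not part of the original text:

```python
# Example B.5 in code: zero divisors in Z_n versus invertibility.
def zero_divisors(n):
    # pairs (a, b) of nonzero elements with a*b = 0 in Z_n
    return [(a, b) for a in range(1, n) for b in range(1, n) if (a * b) % n == 0]

def all_invertible(n):
    # True when every nonzero element of Z_n has a multiplicative inverse
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

print(zero_divisors(6))   # [(2, 3), (3, 2), (3, 4), (4, 3)] -- e.g. 2*3 = 0 in Z_6
print(all_invertible(5))  # True: Z_5 is a field
```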
In fact, C is just the field of complex numbers (see page 4).

Example B.8: The set M of all 2 by 2 matrices with real entries forms a noncommutative ring with zero divisors under the operations of matrix addition and matrix multiplication.

Example B.9: Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with respect to the usual operations of addition and multiplication of polynomials. Moreover, if R is an integral domain then R[x] is also an integral domain.

Now let D be an integral domain. We say that b divides a in D if a = bc for some c ∈ D. An element u ∈ D is called a unit if u divides 1, i.e. if u has a multiplicative inverse. An element b ∈ D is called an associate of a ∈ D if b = ua for some unit u ∈ D. A nonunit p ∈ D is said to be irreducible if p = ab implies a or b is a unit.

An integral domain D is called a unique factorization domain if every nonunit a ∈ D can be written uniquely (up to associates and order) as a product of irreducible elements.

Example B.10: The ring Z of integers is the classical example of a unique factorization domain. The units of Z are 1 and −1. The only associates of n ∈ Z are n and −n. The irreducible elements of Z are the prime numbers.

Example B.11: The set D = {a + b√13 : a, b integers} is an integral domain. The units of D are ±1, 18 ± 5√13 and −18 ± 5√13. The elements 2, 3 − √13 and −3 − √13 are irreducible in D. Observe that 4 = 2 · 2 = (3 − √13)(−3 − √13). Thus D is not a unique factorization domain. (See Problem B.40.)

MODULES

Let M be a nonempty set and let R be a ring with a unit element. Then M is said to be a (left) R-module if M is an additive abelian group and there exists a mapping R × M → M which satisfies the following axioms:

[M1] r(m1 + m2) = r m1 + r m2
[M2] (r + s)m = rm + sm
[M3] (rs)m = r(sm)
[M4] 1·m = m

for any r, s ∈ R and any m, m1, m2 ∈ M.
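For a finite example, the module axioms [M1] through [M4] can also be checked mechanically. The sketch below (our illustration, not the book's) makes the additive group Z_6 into a module over the ring Z of integers via r·m = rm (mod 6), which agrees with the repeated-addition definition of the integer action, and verifies the axioms over a sample of integer scalars.

```python
# Z_6 as a module over the ring Z of integers:
# the action r.m = (r*m) mod 6 extends the repeated-addition
# rule n.g = g + g + ... + g to all integer scalars.
M = range(6)                      # elements of the abelian group Z_6
scalars = range(-10, 11)          # a sample of ring elements r, s in Z

act = lambda r, m: (r * m) % 6    # scalar multiplication R x M -> M
add = lambda m1, m2: (m1 + m2) % 6

for r in scalars:
    for s in scalars:
        for m1 in M:
            for m2 in M:
                assert act(r, add(m1, m2)) == add(act(r, m1), act(r, m2))  # [M1]
                assert act(r + s, m1) == add(act(r, m1), act(s, m1))       # [M2]
                assert act(r * s, m1) == act(r, act(s, m1))                # [M3]
for m in M:
    assert act(1, m) == m                                                  # [M4]
print("module axioms hold for Z_6 over Z")
```

A sample of scalars suffices here only as a sanity check; a full proof that every abelian group is a Z-module is Problem-B.12-style algebra, not enumeration.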
We emphasize that an R-module is a generalization of a vector space where we allow the scalars to come from a ring rather than a field.

Example B.12: Let G be any additive abelian group. We make G into a module over the ring Z of integers by defining
ng = g + g + ... + g (n times),  0g = 0,  (−n)g = −(ng)
where n is any positive integer.

Example B.13: Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R.

Example B.14: Let V be a vector space over a field K and let T : V → V be a linear mapping. We make V into a module over the ring K[x] of polynomials over K by defining f(x)v = f(T)(v). The reader should check that a scalar multiplication has been defined.

Let M be a module over R. An additive subgroup N of M is called a submodule of M if u ∈ N and k ∈ R imply ku ∈ N. (Note that N is then a module over R.)

Let M and M' be R-modules. A mapping T : M → M' is called a homomorphism (or: R-homomorphism or R-linear) if (i) T(u + v) = T(u) + T(v) and (ii) T(ku) = kT(u) for every u, v ∈ M and every k ∈ R.

Problems

GROUPS

B.1. Determine whether each of the following systems forms a group G: (i) G = set of integers, operation subtraction; (ii) G = {1, −1}, operation multiplication; (iii) G = set of nonzero rational numbers, operation division; (iv) G = set of nonsingular n × n matrices, operation matrix multiplication; (v) G = {a + bi : a, b ∈ Z}, operation addition.

B.2. Show that in a group G: (i) the identity element of G is unique; (ii) each a ∈ G has a unique inverse a^{-1} ∈ G; (iii) (a^{-1})^{-1} = a, and (ab)^{-1} = b^{-1}a^{-1}; (iv) ab = ac implies b = c, and ba = ca implies b = c.

B.3. In a group G, the powers of a ∈ G are defined by a^0 = e, a^n = a a^{n−1}, a^{−n} = (a^n)^{-1}, where n ∈ N. Show that the following formulas hold for any integers r, s, t ∈ Z: (i) a^r a^s = a^{r+s}, (ii) (a^r)^s = a^{rs}, (iii) (a^{r+s})^t = a^{rt+st}.

B.4. Show that if G is an abelian group, then (ab)^n = a^n b^n for any a, b ∈ G and any integer n ∈ Z.

B.5.
Suppose G is a group such that (ab)^2 = a^2 b^2 for every a, b ∈ G. Show that G is abelian.

B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is nonempty, and (ii) a, b ∈ H implies ab^{-1} ∈ H.

B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G.

B.8. Show that the set of all powers of a ∈ G is a subgroup of G; it is called the cyclic group generated by a.

B.9. A group G is said to be cyclic if G is generated by some a ∈ G, i.e. G = {a^n : n ∈ Z}. Show that every subgroup of a cyclic group is cyclic.

B.10. Suppose G is a cyclic group. Show that G is isomorphic to the set Z of integers under addition or to the set Z_n (of the integers modulo n) under addition.

B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint subsets.

B.12. The order of a group G, denoted by |G|, is the number of elements of G. Prove Lagrange's theorem: If H is a subgroup of a finite group G, then |H| divides |G|.

B.13. Suppose |G| = p where p is prime. Show that G is cyclic.

B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and (ii) H ∩ N is a normal subgroup of H.

B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G.

B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group G/H under coset multiplication.

B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian.

B.18. Let f : G → G' be a group homomorphism. Show that: (i) f(e) = e' where e and e' are the identity elements of G and G' respectively; (ii) f(a^{-1}) = f(a)^{-1} for any a ∈ G.

B.19. Prove Theorem B.2: Let f : G → G' be a group homomorphism with kernel K. Then K is a normal subgroup of G, and the quotient group G/K is isomorphic to the image of f.

B.20.
Let G be the multiplicative group of complex numbers z such that |z| = 1, and let R be the additive group of real numbers. Prove that G is isomorphic to R/Z.

B.21. For a fixed g ∈ G, let ĝ : G → G be defined by ĝ(a) = g^{-1}ag. Show that ĝ is an isomorphism of G onto G.

B.22. Let G be the multiplicative group of n × n nonsingular matrices over R. Show that the mapping A ↦ |A| is a homomorphism of G into the multiplicative group of nonzero real numbers.

B.23. Let G be an abelian group. For a fixed n ∈ Z, show that the map a ↦ a^n is a homomorphism of G into G.

B.24. Suppose H and N are subgroups of G with N normal. Prove that H ∩ N is normal in H and H/(H ∩ N) is isomorphic to HN/N.

RINGS

B.25. Show that in a ring R: (i) a·0 = 0·a = 0, (ii) a(−b) = (−a)b = −ab, (iii) (−a)(−b) = ab.

B.26. Show that in a ring R with a unit element: (i) (−1)a = −a, (ii) (−1)(−1) = 1.

B.27. Suppose a^2 = a for every a ∈ R. Prove that R is a commutative ring. (Such a ring is called a Boolean ring.)

B.28. Let R be a ring with a unit element. We make R into another ring R̂ by defining a ⊕ b = a + b + 1 and a ⊙ b = ab + a + b. (i) Verify that R̂ is a ring. (ii) Determine the 0-element and the 1-element of R̂.

B.29. Let G be any (additive) abelian group. Define a multiplication in G by a·b = 0. Show that this makes G into a ring.

B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I : a ∈ R} form a ring under coset addition and coset multiplication.

B.31. Let I1 and I2 be ideals in R. Prove that I1 + I2 and I1 ∩ I2 are also ideals in R.

B.32. Let R and R' be rings. A mapping f : R → R' is called a homomorphism (or: ring homomorphism) if (i) f(a + b) = f(a) + f(b) and (ii) f(ab) = f(a)f(b), for every a, b ∈ R. Prove that if f : R → R' is a homomorphism, then the set K = {r ∈ R : f(r) = 0} is an ideal in R. (The set K is called the kernel of f.)

INTEGRAL DOMAINS AND FIELDS

B.33.
Prove that in an integral domain D, if ab = ac, a ≠ 0, then b = c.

B.34. Prove that F = {a + b√2 : a, b rational} is a field.

B.35. Prove that D = {a + b√2 : a, b integers} is an integral domain but not a field.

B.36. Prove that a finite integral domain D is a field.

B.37. Show that the only ideals in a field K are {0} and K.

B.38. A complex number a + bi where a, b are integers is called a Gaussian integer. Show that the set G of Gaussian integers is an integral domain. Also show that the units in G are ±1 and ±i.

B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D/I is an integral domain if and only if I is a prime ideal. (An ideal I is prime if ab ∈ I implies a ∈ I or b ∈ I.)

B.40. Consider the integral domain D = {a + b√13 : a, b integers} (see Example B.11). If α = a + b√13, we define N(α) = a^2 − 13b^2. Prove: (i) N(αβ) = N(α)N(β); (ii) α is a unit if and only if N(α) = ±1; (iii) the units of D are ±1, 18 ± 5√13 and −18 ± 5√13; (iv) the numbers 2, 3 − √13 and −3 − √13 are irreducible.

MODULES

B.41. Let M be an R-module and let A and B be submodules of M. Show that A + B and A ∩ B are also submodules of M.

B.42. Let M be an R-module with submodule N. Show that the cosets {u + N : u ∈ M} form an R-module under coset addition and scalar multiplication defined by r(u + N) = ru + N. (This module is denoted by M/N and is called the quotient module.)

B.43. Let M and M' be R-modules and let f : M → M' be an R-homomorphism. Show that the set K = {u ∈ M : f(u) = 0} is a submodule of M. (The set K is called the kernel of f.)

B.44. Let M be an R-module and let E(M) denote the set of all R-homomorphisms of M into itself. Define the appropriate operations of addition and multiplication in E(M) so that E(M) becomes a ring.

Appendix C

Polynomials over a Field

INTRODUCTION

We will investigate polynomials over a field K and show that they have many properties which are analogous to properties of the integers.
These results play an important role in obtaining canonical forms for a linear operator T on a vector space V over K.

RING OF POLYNOMIALS

Let K be a field. Formally, a polynomial f over K is an infinite sequence of elements from K in which all except a finite number of them are 0:
f = (..., 0, a_n, ..., a_1, a_0)
(We write the sequence so that it extends to the left instead of to the right.) The entry a_k is called the kth coefficient of f. If n is the largest integer for which a_n ≠ 0, then we say that the degree of f is n, written
deg f = n
We also call a_n the leading coefficient of f, and if a_n = 1 we call f a monic polynomial. On the other hand, if every coefficient of f is 0 then f is called the zero polynomial, written f = 0. The degree of the zero polynomial is not defined.

Now if g is another polynomial over K, say
g = (..., 0, b_m, ..., b_1, b_0)
then the sum f + g is the polynomial obtained by adding corresponding coefficients. That is, if m ≤ n then
f + g = (..., 0, a_n, ..., a_m + b_m, ..., a_1 + b_1, a_0 + b_0)
Furthermore, the product fg is the polynomial
fg = (..., 0, a_n b_m, ..., a_1 b_0 + a_0 b_1, a_0 b_0)
that is, the kth coefficient c_k of fg is
c_k = Σ_{i=0}^{k} a_i b_{k−i} = a_0 b_k + a_1 b_{k−1} + ... + a_k b_0
The following theorem applies.

Theorem C.1: The set P of polynomials over a field K under the above operations of addition and multiplication forms a commutative ring with a unit element and with no zero divisors, i.e. an integral domain. If f and g are nonzero polynomials in P, then deg (fg) = deg f + deg g.

NOTATION

We identify the scalar a_0 ∈ K with the polynomial
a_0 = (..., 0, a_0)
We also choose a symbol, say t, to denote the polynomial
t = (..., 0, 1, 0)
We call the symbol t an indeterminate. Multiplying t by itself, we obtain
t^2 = (..., 0, 1, 0, 0),  t^3 = (..., 0, 1, 0, 0, 0),  ...
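The coefficient formula c_k = a_0 b_k + a_1 b_{k−1} + ... + a_k b_0 above is an ordinary convolution, which is easy to state in code. In the sketch below (our illustration, not part of the text) a polynomial is stored as its list of coefficients a_0, a_1, ..., a_n, and the degree identity of Theorem C.1 is checked on an example; the names are ours.

```python
def poly_mul(f, g):
    """Product of polynomials given as coefficient lists
    [a0, a1, ..., an]; the kth output is c_k = sum_i a_i * b_(k-i)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c

def deg(f):
    """Degree = index of the last nonzero coefficient.
    Returns None for the zero polynomial (degree undefined)."""
    nz = [k for k, a in enumerate(f) if a != 0]
    return nz[-1] if nz else None

f = [1, 2]        # 1 + 2t      (degree 1)
g = [3, 0, 1]     # 3 + t^2     (degree 2)
fg = poly_mul(f, g)
print(fg)                          # [3, 6, 1, 2], i.e. 3 + 6t + t^2 + 2t^3
print(deg(fg) == deg(f) + deg(g))  # True
```

Over a field (or any integral domain) the leading coefficients cannot multiply to zero, which is exactly why deg (fg) = deg f + deg g holds in Theorem C.1; over a ring with zero divisors the degree can drop.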
Thus the above polynomial f can be written uniquely in the usual form
f = a_n t^n + ... + a_1 t + a_0
When the symbol t is selected as the indeterminate, the ring of polynomials over K is denoted by K[t], and a polynomial f is frequently denoted by f(t).

We also view the field K as a subset of K[t] under the above identification. This is possible since the operations of addition and multiplication of elements of K are preserved under this identification:
(..., 0, a_0) + (..., 0, b_0) = (..., 0, a_0 + b_0)
(..., 0, a_0) · (..., 0, b_0) = (..., 0, a_0 b_0)
We remark that the nonzero elements of K are the units of the ring K[t].

We also remark that every nonzero polynomial is an associate of a unique monic polynomial. Hence if d and d' are monic polynomials for which d divides d' and d' divides d, then d = d'. (A polynomial g divides a polynomial f if there is a polynomial h such that f = hg.)

DIVISIBILITY

The following theorem formalizes the process known as "long division".

Theorem C.2 (Division Algorithm): Let f and g be polynomials over a field K with g ≠ 0. Then there exist polynomials q and r such that
f = qg + r
where either r = 0 or deg r < deg g.

Proof: If f = 0 or if deg f < deg g, then we have the required representation
f = 0g + f
Now suppose deg f ≥ deg g, say
f = a_n t^n + ... + a_1 t + a_0  and  g = b_m t^m + ... + b_1 t + b_0
where a_n, b_m ≠ 0 and n ≥ m. We form the polynomial
f_1 = f − (a_n / b_m) t^{n−m} g     (1)
Then deg f_1 < deg f. By induction, there exist polynomials q_1 and r such that
f_1 = q_1 g + r
where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,
f = (q_1 + (a_n / b_m) t^{n−m}) g + r
which is the desired representation.

Theorem C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is an ideal in K[t], then there exists a unique monic polynomial d which generates I, that is, such that d divides every polynomial f ∈ I.

Proof: Let d be a polynomial of lowest degree in I.
Since we can multiply d by a nonzero scalar and still remain in I, we can assume without loss in generality that d is a monic polynomial. Now suppose f ∈ I. By Theorem C.2 there exist polynomials q and r such that
f = qd + r  where either r = 0 or deg r < deg d
Now f, d ∈ I implies qd ∈ I and hence r = f − qd ∈ I. But d is a polynomial of lowest degree in I. Accordingly, r = 0 and f = qd, that is, d divides f. It remains to show that d is unique. If d' is another monic polynomial which generates I, then d divides d' and d' divides d. This implies that d = d', because d and d' are monic. Thus the theorem is proved.

Theorem C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique monic polynomial d such that: (i) d divides f and g; and (ii) if d' divides f and g, then d' divides d.

Definition: The above polynomial d is called the greatest common divisor of f and g. If d = 1, then f and g are said to be relatively prime.

Proof of Theorem C.4: The set I = {mf + ng : m, n ∈ K[t]} is an ideal. Let d be the monic polynomial which generates I. Note f, g ∈ I; hence d divides f and g. Now suppose d' divides f and g. Let J be the ideal generated by d'. Then f, g ∈ J and hence I ⊆ J. Accordingly, d ∈ J and so d' divides d as claimed. It remains to show that d is unique. If d_1 is another (monic) greatest common divisor of f and g, then d divides d_1 and d_1 divides d. This implies that d = d_1 because d and d_1 are monic. Thus the theorem is proved.

Corollary C.5: Let d be the greatest common divisor of the polynomials f and g. Then there exist polynomials m and n such that d = mf + ng. In particular, if f and g are relatively prime then there exist polynomials m and n such that mf + ng = 1.

The corollary follows directly from the fact that d generates the ideal I = {mf + ng : m, n ∈ K[t]}.

FACTORIZATION

A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies f or g is a scalar.

Lemma C.6: Suppose p ∈ K[t] is irreducible.
If p divides the product fg of polynomials f, g ∈ K[t], then p divides f or p divides g. More generally, if p divides the product of n polynomials f_1 f_2 ... f_n, then p divides one of them.

Proof: Suppose p divides fg but not f. Since p is irreducible, the polynomials f and p must then be relatively prime. Thus there exist polynomials m, n ∈ K[t] such that mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg and so mfg, and p divides npg; hence p divides the sum g = mfg + npg.

Now suppose p divides f_1 f_2 ... f_n. If p divides f_1, then we are through. If not, then by the above result p divides the product f_2 ... f_n. By induction on n, p divides one of the polynomials f_2, ..., f_n. Thus the lemma is proved.

Theorem C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t]. Then f can be written uniquely (except for order) as a product
f = k p_1 p_2 ... p_n
where k ∈ K and the p_i are monic irreducible polynomials in K[t].

Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K, then such a product clearly exists. On the other hand, suppose f = gh where g and h are nonscalars. Then g and h have degrees less than that of f. By induction, we can assume
g = k_1 g_1 g_2 ... g_r  and  h = k_2 h_1 h_2 ... h_s
where k_1, k_2 ∈ K and the g_i and h_j are monic irreducible polynomials. Accordingly,
f = (k_1 k_2) g_1 g_2 ... g_r h_1 h_2 ... h_s
is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose
f = k p_1 p_2 ... p_n = k' q_1 q_2 ... q_m
where k, k' ∈ K and the p_1, ..., p_n, q_1, ..., q_m are monic irreducible polynomials. Now p_1 divides k' q_1 ... q_m. Since p_1 is irreducible it must divide one of the q_i by the above lemma. Say p_1 divides q_1. Since p_1 and q_1 are both irreducible and monic, p_1 = q_1. Accordingly,
k p_2 ... p_n = k' q_2 ...
q_m. By induction, we have that n = m and p_2 = q_2, ..., p_n = q_m for some rearrangement of the q_i. We also have that k = k'. Thus the theorem is proved.

If the field K is the complex field C, then we have the following result which is known as the fundamental theorem of algebra; its proof lies beyond the scope of this text.

Theorem C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial over the complex field C. Then f(t) can be written uniquely (except for order) as a product
f(t) = k(t − r_1)(t − r_2)...(t − r_n)
where k, r_i ∈ C, i.e. as a product of linear polynomials.

In the case of the real field R we have the following result.

Theorem C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be written uniquely (except for order) as a product
f(t) = k p_1(t) p_2(t) ... p_m(t)
where k ∈ R and the p_i(t) are monic irreducible polynomials of degree one or two.

INDEX

Abelian group, 320 Absolute value, 4 Addition, in R^n, 2 of linear mappings, 128 of matrices, 36 Adjoint, classical, 176 operator, 284 Algebra, isomorphism, 169 of linear operators, 129 of square matrices, 43 Algebraic multiplicity, 203 Alternating, bilinear forms, 262 multilinear forms, 178, 277 Angle between vectors, 282 Annihilator, 227, 251 Anti-symmetric bilinear form, 263 operator, 285 Augmented matrix, 40 Basis, 88 change of, 153 Bessel's inequality, 309 Bijective mapping, 123 Bilinear form, 261, 277 Binary relation, 318 Block matrix, 45 Bounded function, 65 C, 4 C^n, 5 Cayley-Hamilton theorem, 201, 211 Canonical forms in Euclidean spaces, 288 unitary spaces, 290 vector spaces, 222 Cauchy-Schwarz inequality, 4, 10, 281 Cells, 45 Change of basis, 153 Characteristic, equation, 200 matrix, 200 polynomial, 200, 203, 210 value, 198 vector, 198 Classical adjoint, 176 Co-domain, 121 Coefficient matrix, 40 Cofactor, 174 Column, of a matrix, 35 rank, 90 space, 67 vector, 36 Companion matrix, 228 Complex numbers, 4 Components, 2 Composition of mappings, 121 Congruent matrices, 262 Conjugate
complex number, 4 Consistent linear equations, 31 Convex, 260 Coordinate, 2 vector, 92 Coset, 229 Cramer's rule, 177 Cyclic group, 325 Cyclic subspaces, 227 Decomposition, direct sum, 224 primary, 225 Degenerate bilinear form, 262 Dependent vectors, 86 Determinant, 171 Determinantal rank, 195 Diagonal matrix, 43 of a matrix, 43 Diagonalization, Euclidean spaces, 288 unitary spaces, 290 vector spaces, 155, 199 Dimension, 88 Direct sum, 69, 82, 224 Disjoint, 316 Distance, 3, 280 Distinguished elements, 41 Division algorithm, 328 Domain, integral, 322 of a mapping, 121 Dot product, in C^n, 6 in R^n, 3 Dual basis, 250 space, 249 Echelon form, linear equations, 21 matrices, 41 Echelon matrix, 41 Eigenspace, 198, 205 Eigenvalue, 198 Eigenvector, 198 Element, 315 Elementary, column operation, 61 divisors, 229 matrix, 56 row operation, 41 Elimination, 20 Empty set, 315 Equality of matrices, 36 of vectors, 2 Equations (see Linear equations) Equivalence relation, 318 Equivalent matrices, 61 Euclidean space, 3, 279 Even function, 83 permutation, 171 External direct sum, 82 Field, 323 Free variable, 21 Function, 121 Functional, 249 Gaussian integers, 326 Generate, 66 Geometric multiplicity, 203 Gram-Schmidt orthogonalization, 283 Greatest common divisor, 329 Group, 320 Hermitian, form, 266 matrix, 266 Hilbert space, 280 Hom (V, U), 128 Homogeneous linear equations, 19 Homomorphism, 123 Hyperplane, 14 Ideal, 322 Identity, element, 320 mapping, 123 matrix, 43 permutation, 172 Image, 121, 125 Inclusion mapping, 146 Independent subspaces, 244 vectors, 86 Index of nilpotency, 225 set, 316 Injective mapping, 123 Inner product, 279 Inner product space, 279 Integers modulo n, 323 Integral domain, 322 Intersection of sets, 316 Invariant subspace, 223 Inverse, mapping, 123 matrix, 44, 176 Invertible, linear operator, 130 matrix, 44 Irreducible, 323, 329 Isomorphism of algebras, 169 groups, 321 inner product spaces, 286, 311 vector spaces, 93, 124 Jordan canonical form, 226
Kernel, 123, 321, 326 ℓ²-space, 280 Line segment, 14, 260 Linear combination of equations, 30 of vectors, 66 Linear dependence, 86 in R^n, 28 Linear equations, 18, 127, 176, 251, 282 Linear functional, 249 Linear independence, 86 in R^n, 28 Linear mapping, 123 matrix of, 160 rank of, 126 Linear operators, 129 Linear span, 66 Mapping, 121 linear, 123 Matrices, 35 addition, 36 augmented, 40 block, 45 change of basis, 153 coefficient, 40 column, 35 congruent, 262 determinant, 171 diagonal, 43 echelon, 41 equivalent, 61 Hermitian, 266 identity, 43 multiplication, 39 normal, 290 rank, 90 row, 35 row canonical form, 42, 68 row equivalent, 41 row space, 60 scalar, 43 scalar multiplication, 36 similar, 155 size, 35 square, 43 symmetric, 65, 288 transition, 153 transpose, 39 triangular, 43 zero, 37 Matrix representation, bilinear forms, 262 linear mappings, 150 Maximal independent set, 89 Minimal polynomial, 202, 212 Minkowski's inequality, 10 Minor, 174 Module, 323 Monic polynomial, 201 Multilinear, 178, 277 Multiplication of matrices, 37, 39 N (positive integers), 315 n-space, 2 n-tuple, 2 Nilpotent, 225 Nonnegative semi-definite, 266 Nonsingular, linear mapping, 127 matrix, 130 Norm, 279 in R^n, 4 Normal operator, 286, 290, 303 Normal subgroup, 320 Normalized vector, 280 Null set, 315 Nullity, 126 Odd, function, 73 permutation, 171 One-to-one mappings, 123 Onto mappings, 123 Operations with linear mappings, 128 Operators (see Linear operators) Ordered pair, 318 Orthogonal complement, 281 matrix, 287 operator, 286 vectors, 3, 280 Orthogonally equivalent, 288 Orthonormal, 282 Parallelogram law, 307 Parity, 171 Partition, 319 Permutations, 171 Polar form, 264, 307 Polynomials, 327 Positive matrix, 310 operator, 288 Positive definite, bilinear form, 265 matrix, 272, 310 operator, 288 Primary decomposition theorem, 225 Prime ideal, 326 Principal ideal, 322 Principal minor, 219 Product set, 317 Projection operator, 243, 308 orthogonal, 281 Proper
subset, 316 value, 198 vector, 198 Q (rational numbers), 315 Quadratic form, 264 Quotient, group, 320 module, 326 ring, 322 set, 319 space, 229 R (real field), 315 R^n, 2 Rank, bilinear form, 262 linear mapping, 126 matrix, 90, 195 Rational canonical form, 228 Relation, 318 Relatively prime, 329 Ring, 322 Row, canonical form, 42 equivalent matrices, 41 of a matrix, 35 operations, 41 rank, 90 reduced echelon form, 41 reduction, 42 vector, 36 Scalar, 2, 63 mapping, 219 matrix, 43 Scalar multiplication, 69 of linear mappings, 128 of matrices, 36 Second dual space, 251 Self-adjoint operator, 285 Set, 315 Sgn, 171 Sign of a permutation, 171 Signature, 265, 266 Similar matrices, 155 Singular mappings, 127 Size of a matrix, 35 Skew-adjoint operator, 285 Skew-symmetric bilinear form, 263 Solution, of linear equations, 18, 23 space, 65 Span, 66 Spectral theorem, 291 Square matrices, 43 Subgroup, 320 Subring, 322 Subset, 315 Subspace (of a vector space), 65 sum of, 68 Surjective mapping, 123 Sylvester's theorem, 265 Symmetric, bilinear form, 263 matrix, 65 operator, 285, 288, 300 System of linear equations, 19 Trace, 155 Transition matrix, 153 Transpose, of a linear mapping, 252 of a matrix, 39 Transposition, 172 Triangle inequality, 293 Triangular, form, 222 matrix, 43 Trivial solution, 19 Union of sets, 316 Unique factorization, 323 Unit vector, 280 Unitarily equivalent, 288 Unitary, matrix, 287 operator, 286 space, 279 Universal set, 316 Upper triangular matrix, 43 Usual basis, 88, 89 Vector, 63 in C^n, 5 in R^n, 2 Vector space, 63 Venn diagram, 316 Z (integers), 315 Z_n (ring of integers modulo n), 323 Zero, mapping, 124 matrix, 37 of a polynomial, 44 solution, 19 vector, 3, 63